Preface


The present notes are the outcome of lectures I gave at Columbia University in the 
fall of 1987, and at the University of Colorado 1988/1989. Although there is necessarily 
some overlap with my earlier Lecture Notes on Diophantine Approximation (Springer 
Lecture Notes 785, 1980), this overlap is small. In general, whereas in the earlier Notes 
I gave a systematic exposition with all the proofs, the present notes present a variety 
of topics, and sometimes quote from the literature without giving proofs. Nevertheless, 
I believe that the pace is again leisurely. 


Chapter I contains a fairly thorough discussion of Siegel’s Lemma and of heights. 
Chapter II is devoted to Roth’s Theorem. Rather than Roth’s Lemma, I use a generalization 
of Dyson’s Lemma as given by Esnault and Viehweg. A proof of this generalized 
lemma is not given; it is beyond the scope of the present notes. An advantage of the 
lemma is that it leads to new bounds on the number of exceptional approximations in 
Roth’s Theorem, as given recently by Bombieri and Van der Poorten. These bounds 
turn out to be best possible in some sense. Chapter III deals with the Thue equation. 
Among the recent developments are bounds by Bombieri and the author on the number 
of solutions of such equations, and by Mueller and the author on the number of solutions 
of Thue equations with few nonzero coefficients, say s such coefficients (apart 
from the constant term). I give a proof of the former, but deal with the latter only 
up to s = 3, i.e., to trinomial Thue equations. Chapter IV is about S-unit equations 
and hyperelliptic equations. S-unit equations include equations such as $2^x + 3^y = 4^z$. I 
present Evertse’s remarkable bounds for such equations. As for elliptic and hyperelliptic 
equations, I mention a few basic facts, often without proofs, and proceed to counting 
the number of solutions as in recent works of Evertse, and of Silverman, where the connection 
with the Mordell-Weil rank is explored. Chapter V is on certain diophantine 
equations in more than two variables. A tool here is my Subspace Theorem, of which I 
quote several versions, but without proofs. I study generalized S-unit equations, such 
as, e.g., $\pm a_1^{x_1} \pm a_2^{x_2} \pm \cdots \pm a_n^{x_n} = 0$ with given integers $a_i > 1$, as well as norm form 
equations. Recent advances make it possible to give explicit estimates on the number of solutions. 
The notes end with an Epilogue on the abc-conjecture of Oesterlé and Masser. 


Handwritten notes of my lectures were taken at Columbia University by Mr. Agboola, 
and at the University of Colorado by Ms. Deanna Caveny. The manuscript was 
typed by Ms. Andrea Hennessy and Ms. Elizabeth Stimmel. My thanks are due to 
them. 


January 1991 Wolfgang M. Schmidt 
I. Siegel’s Lemma and Heights 
§1. Siegel’s Lemma. 
Consider a system of homogeneous linear equations 


$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= 0 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= 0 \end{aligned} \tag{1.1}$$


If m < n and the coefficients lie in a field, then there is a nontrivial solution with 
components in the field. If m < n and the coefficients lie in Z (the integers), then there 
is a nontrivial solution in integers. (Just take a solution with rational components and 
multiply by the common denominator.) It is reasonable to believe that if the coefficients 
are small integers, then there will also be a solution in small integers. This idea was 
used by A. Thue (1909) and formalized by Siegel in (1929; on p. 213 of his Collected 
Works). 


LEMMA 1. Suppose that in (1.1) the coefficients $a_{ij}$ lie in Z and have $|a_{ij}| \le A$ 
$(1 \le i, j \le n)$ where A is natural. Then there is a nontrivial solution in Z with 

$$|x_i| < 1 + (nA)^{m/(n-m)} \quad (i = 1, \dots, n).$$


Proof. We follow Siegel. Let H be an integer parameter to be specified later. Let 
C be the cube consisting of points $\mathbf{x} = (x_1, \dots, x_n)$ with 

$$|x_i| \le H \quad (i = 1, \dots, n).$$


There are $(2H + 1)^n$ integer points in this cube, since there are 2H + 1 possibilities for 
each coordinate. Let T be the linear map $\mathbb{R}^n \to \mathbb{R}^m$ with 

$$T\mathbf{x} = (a_{11}x_1 + \cdots + a_{1n}x_n, \dots, a_{m1}x_1 + \cdots + a_{mn}x_n).$$


Writing $T\mathbf{x} = \mathbf{y} = (y_1, \dots, y_m)$, we observe that the image of C lies in the cube 

$$C' : |y_j| \le nAH \quad (j = 1, \dots, m)$$
of $\mathbb{R}^m$. The number of integer points in C' is $(2nAH + 1)^m$. Suppose now that 
$$(2nAH + 1)^m < (2H + 1)^n. \tag{1.2}$$


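Then T restricted to the integer points of the original cube will not be one-to-one. So there 
exist $\mathbf{x}', \mathbf{x}''$ in C, with $\mathbf{x}' \ne \mathbf{x}''$, such that $T\mathbf{x}' = T\mathbf{x}''$. Put $\mathbf{x} = \mathbf{x}' - \mathbf{x}''$. Then $T\mathbf{x} = \mathbf{0}$. 
So $\mathbf{x}$ is a nontrivial solution of the system with integer coordinates. Note that $|x_i| \le 2H$ 
$(i = 1, \dots, n)$, because $|x_i'|, |x_i''| \le H$ $(i = 1, \dots, n)$, so $|x_i| = |x_i' - x_i''| \le |x_i'| + |x_i''| \le 2H$. 
Choose H to be the natural number satisfying 
$$(nA)^{m/(n-m)} - 1 \le 2H < (nA)^{m/(n-m)} + 1.$$
Then 
$$(2H + 1)^n = (2H + 1)^m (2H + 1)^{n-m} \ge (2H + 1)^m (nA)^m > (2nAH + 1)^m,$$
where the last step uses $(2H + 1)nA = 2nAH + nA$ and $nA > 1$. Thus (1.2) holds. 
So there exists an $\mathbf{x}$ satisfying $|x_i| \le 2H < 1 + (nA)^{m/(n-m)}$. 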
So the proof of Siegel’s Lemma uses the box principle. 
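The counting argument in this proof can be run mechanically on a small system. The following is a minimal sketch, not part of the original notes: the function names and the sample coefficient matrix are illustrative assumptions, and H is taken as the smallest natural number with $(2H+1)^{n-m} \ge (nA)^m$, computed in exact integer arithmetic.

```python
from itertools import product

def choose_H(n, m, A):
    # Smallest natural H with (2H + 1)^(n - m) >= (nA)^m, using exact integers.
    H = 1
    while (2 * H + 1) ** (n - m) < (n * A) ** m:
        H += 1
    return H

def box_principle_solution(rows):
    """Find two integer points of the cube |x_i| <= H with the same image
    under x -> (row . x for each row) and return their difference and H."""
    m, n = len(rows), len(rows[0])
    A = max(abs(a) for row in rows for a in row)
    H = choose_H(n, m, A)
    seen = {}  # image vector -> first point of the cube that produced it
    for x in product(range(-H, H + 1), repeat=n):
        image = tuple(sum(a * xi for a, xi in zip(row, x)) for row in rows)
        if image in seen:
            return tuple(u - v for u, v in zip(x, seen[image])), H
        seen[image] = x
    raise AssertionError("no collision; cannot happen when m < n and nA > 1")

# Hypothetical sample system: m = 2 equations, n = 3 unknowns, A = 4.
rows = [[3, -2, 1], [1, 4, -3]]
x, H = box_principle_solution(rows)
print(x, H)
```

The search stops at the first collision, so it usually inspects far fewer than $(2H+1)^n$ points; the lemma only guarantees that a collision must occur somewhere in the cube.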
Can the exponent be improved? The answer is “no”. 


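Siegel’s Lemma is almost best possible. Put k = n − m, and for large P pick distinct 
primes $p_{ij}$ $(1 \le i \le k,\ 1 \le j \le m)$ with $\eta P < p_{ij} < P$, where $0 < \eta < 1$ is given. Put 

$$P_j = p_{1j}p_{2j}\cdots p_{kj} \quad (1 \le j \le m), \qquad P_{ij} = P_j/p_{ij} \quad (1 \le i \le k,\ 1 \le j \le m),$$
$$Q_i = p_{i1}p_{i2}\cdots p_{im} \quad (1 \le i \le k), \qquad Q_{ij} = Q_i/p_{ij} \quad (1 \le i \le k,\ 1 \le j \le m).$$
Consider the system of m equations in n variables: 
$$\begin{aligned} P_{11}x_1 + \cdots + P_{k1}x_k - P_1y_1 &= 0 \\ P_{12}x_1 + \cdots + P_{k2}x_k - P_2y_2 &= 0 \\ &\;\;\vdots \\ P_{1m}x_1 + \cdots + P_{km}x_k - P_my_m &= 0. \end{aligned}$$

The maximum modulus A of its coefficients has $A < P^k$. The following are solution 
vectors: 
$$\begin{aligned} \mathbf{z}_1 &= (Q_1, 0, \dots, 0, Q_{11}, Q_{12}, \dots, Q_{1m}) \\ &\;\;\vdots \\ \mathbf{z}_k &= (0, 0, \dots, Q_k, Q_{k1}, Q_{k2}, \dots, Q_{km}). \end{aligned}$$

It is clear that every solution of our system of equations is a linear combination 
$c_1\mathbf{z}_1 + \cdots + c_k\mathbf{z}_k$. For integer solutions, $c_i$ is necessarily a rational number whose 
denominator divides $Q_i$, and then every component of $c_i\mathbf{z}_i$ has a denominator dividing $Q_i$. Moreover, 
if, say, $c_1$ is not integral, then (since $Q_{11}, Q_{12}, \dots, Q_{1m}$ are coprime), $c_1\mathbf{z}_1$ is not integral 
and has some component whose denominator is divisible by a prime $p_{1j}$. But since $c_2\mathbf{z}_2, \dots, c_k\mathbf{z}_k$ 
don’t have $p_{1j}$ in the denominator, $c_1\mathbf{z}_1 + c_2\mathbf{z}_2 + \cdots + c_k\mathbf{z}_k$ cannot be integral, a 
contradiction. Therefore the integer solutions are $\mathbf{z} = c_1\mathbf{z}_1 + \cdots + c_k\mathbf{z}_k$ with $c_1, \dots, c_k$ 
in Z. When $\mathbf{z} \ne \mathbf{0}$, say $c_1 \ne 0$, the first component $x_1$ of $\mathbf{z}$ has 

$$|x_1| \ge Q_1 > (\eta P)^m = \eta^m P^m > \eta^m A^{m/k} = \eta^m A^{m/(n-m)}.$$
Therefore every integer solution $(x_1, \dots, x_k, y_1, \dots, y_m) \ne (0, \dots, 0)$ has 

$$\max(|x_1|, \dots, |x_k|) > \eta^m A^{m/(n-m)}.$$

Here $\eta$ may be taken arbitrarily close to 1. 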
Another approach is as follows. When m = n − 1, consider the system of equations 

$$Ax_i - x_{i+1} = 0 \quad (i = 1, \dots, n-1).$$

Every nontrivial solution, in fact every nontrivial complex solution, has $x_n/x_1 = A^{n-1}$. 
Thus if we set 

$$q(\mathbf{x}) = \max |x_i/x_j|,$$
with the maximum over i, j in $1 \le i, j \le n$ with $x_j \ne 0$, then $q(\mathbf{x}) = A^{n-1} = A^{m/(n-m)}$. 
But then for integer solutions, $\max(|x_1|, \dots, |x_n|) \ge A^{m/(n-m)}$. 
Exercise 1a. Suppose now that m = 1. For large A, construct an equation 
$$a_1x_1 + \cdots + a_nx_n = 0$$
with integral coefficients and $|a_i| \le A$ $(i = 1, \dots, n)$, such that every nontrivial solution 
$\mathbf{x}$ with complex components has 

$$q(\mathbf{x}) \ge c(n)A^{1/(n-1)} = c(n, m)A^{m/(n-m)}, \quad \text{where } c(n) = c(n, m) > 0.$$

This approach can be carried out for general n, m. See Schmidt (1985). 


§2. Geometry of Numbers. 


The subject was founded by Minkowski (1896 & 1910). Other references are Cassels 
(1959), Gruber and Lekkerkerker (1987), and Schmidt (1980, Chapter IV). 

A lattice $\Lambda$ is a subgroup of $\mathbb{R}^n$ which is generated by n linearly independent vectors 
$\mathbf{b}_1, \dots, \mathbf{b}_n$ (linearly independent over $\mathbb{R}$). The elements of this lattice are $c_1\mathbf{b}_1 + \cdots + c_n\mathbf{b}_n$ 
with $c_i \in \mathbb{Z}$. 




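The set $\mathbf{b}_1, \dots, \mathbf{b}_n$ is called a basis. A basis is not uniquely determined. For example, 
$\mathbf{b}_1 + \mathbf{b}_2, \mathbf{b}_2, \dots, \mathbf{b}_n$ is another basis. 
How unique is a basis? Suppose $\mathbf{b}_1', \dots, \mathbf{b}_n'$ is another basis. Then 

$$\mathbf{b}_i' = \sum_{j=1}^{n} c_{ij}\mathbf{b}_j \quad \text{and} \quad c_{ij} \in \mathbb{Z}$$

and 
$$\mathbf{b}_i = \sum_{j=1}^{n} c_{ij}'\mathbf{b}_j' \quad \text{and} \quad c_{ij}' \in \mathbb{Z}.$$

So the matrices $(c_{ij})$ and $(c_{ij}')$ are inverse to each other and $c_{ij}, c_{ij}' \in \mathbb{Z}$, so $\det(c_{ij}) = 
\det(c_{ij}') = \pm 1$. Thus the matrix $(c_{ij})$ is unimodular, where by definition a unimodular 
matrix is a square matrix with integer entries and determinant 1 or −1. 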


LEMMA 2A. A necessary and sufficient condition for a subset $\Lambda$ of $\mathbb{R}^n$ to be a 
lattice is the following: 
(i) $\Lambda$ is a group under addition. 
(ii) $\Lambda$ contains n linearly independent vectors. 
(iii) $\Lambda$ is discrete. 

For a proof, see e.g., Schmidt (1980, Ch. IV, Theorem 8A). 

Consider $\mathbb{R}^n$ with the Euclidean metric and $\Lambda$ a lattice with $\mathbf{b}_1, \dots, \mathbf{b}_n$ as basis. Let 
$\Pi$ be the set of linear combinations $\lambda_1\mathbf{b}_1 + \cdots + \lambda_n\mathbf{b}_n$ with $0 \le \lambda_i < 1$ $(i = 1, \dots, n)$. 
Then $\Pi$ is called a fundamental parallelepiped of $\Lambda$. 

The fundamental parallelepiped does depend on which basis is chosen. The volume 
of $\Pi$ is given by $V(\Pi) = |\det(\mathbf{b}_1, \dots, \mathbf{b}_n)|$ where the right-hand side involves the matrix 
whose rows are respectively made up of the coordinates of $\mathbf{b}_1, \dots, \mathbf{b}_n$ with respect to an 
orthonormal basis of $\mathbb{R}^n$. This volume is independent of the chosen basis of the lattice, 
since different bases are connected by unimodular transformations. It is an invariant of 
the lattice. 


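We define 
$$\det \Lambda = V(\Pi).$$
Notice that when $\mathbf{b}_i = (b_{i1}, \dots, b_{in})$, then 
$$(\det \Lambda)^2 = \begin{vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{vmatrix} \cdot \begin{vmatrix} b_{11} & \cdots & b_{n1} \\ \vdots & & \vdots \\ b_{1n} & \cdots & b_{nn} \end{vmatrix} = \begin{vmatrix} \mathbf{b}_1\mathbf{b}_1 & \cdots & \mathbf{b}_1\mathbf{b}_n \\ \vdots & & \vdots \\ \mathbf{b}_n\mathbf{b}_1 & \cdots & \mathbf{b}_n\mathbf{b}_n \end{vmatrix} \tag{2.1}$$

where the inner product of vectors $\mathbf{x}$, $\mathbf{y}$ is denoted by $\mathbf{x}\mathbf{y}$. 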
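Identity (2.1), and the independence of the determinant of a lattice from the chosen basis, can be checked on a small example. The sketch below is illustrative only: the helper names `det` and `gram` and the sample basis are assumptions introduced here, not notation from the notes.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def gram(vectors):
    """Matrix of the inner products b_i b_j."""
    return [[sum(u * v for u, v in zip(bi, bj)) for bj in vectors] for bi in vectors]

# Hypothetical basis b_1, b_2, b_3 of a lattice in R^3.
basis = [[2, 1, 0], [0, 3, 1], [1, 0, 4]]
# The basis b_1 + b_2, b_2, b_3 of the same lattice (a unimodular change of basis).
other = [[2, 4, 1], [0, 3, 1], [1, 0, 4]]

det_lattice = abs(det(basis))
print(det_lattice, det(gram(basis)))      # det of the lattice and its Gram determinant
print(abs(det(other)), det(gram(other)))  # same values for the second basis
```

Both bases give the same determinant, and in each case the Gram determinant is its square, as (2.1) asserts.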
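Every $\mathbf{x}$ in $\mathbb{R}^n$ may uniquely be written as $\mathbf{x} = \mathbf{x}' + \mathbf{x}''$ where $\mathbf{x}' \in \Pi$ and $\mathbf{x}'' \in \Lambda$: 

$$\mathbf{x} = \sum_{i=1}^{n} \xi_i\mathbf{b}_i = \sum_{i=1}^{n} \{\xi_i\}\mathbf{b}_i + \sum_{i=1}^{n} [\xi_i]\mathbf{b}_i.$$

Here we used the notation that uniquely 

$$\xi = [\xi] + \{\xi\}$$

where $[\xi]$ is an integer, called the integer part of $\xi$, and $\{\xi\}$ satisfies $0 \le \{\xi\} < 1$ and is 
called the fractional part of $\xi$. 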


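$\mathbb{Z}^n$ is a lattice with basis $\mathbf{e}_1, \dots, \mathbf{e}_n$ where $\mathbf{e}_i = (0, \dots, 0, 1, 0, \dots, 0)$ $(i = 1, \dots, n)$, 
and with $\det \mathbb{Z}^n = 1$. If $\Lambda$ is an arbitrary lattice with basis $\mathbf{b}_1, \dots, \mathbf{b}_n$, then there exists 
a linear transformation T such that $T\mathbf{e}_i = \mathbf{b}_i$ $(i = 1, \dots, n)$. So $T\mathbb{Z}^n = \Lambda$. 
Is T unique? Suppose $T\mathbb{Z}^n = T'\mathbb{Z}^n$. Then $(T')^{-1}T\mathbb{Z}^n = \mathbb{Z}^n$, so $\det((T')^{-1}T) = \pm 1$ 
and $(T')^{-1}T$ is unimodular. Call it U. Then T = T'U. Observe that 

$$\det \Lambda = |\det T|.$$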


THEOREM 2B. (Minkowski’s First Theorem on Convex Sets.) Let $B \subset \mathbb{R}^n$ be 
a convex set which is symmetric about the origin (i.e., $\mathbf{x} \in B$ if and only if $-\mathbf{x} \in B$) of 
volume 

$$V(B) > 2^n \det \Lambda \tag{2.2}$$

where $\Lambda$ is a lattice. Then B contains a non-zero lattice point. 


Comments. The volume here is the Jordan volume, i.e., the Riemann integral over 
the characteristic function of the set. Every bounded convex set has such a volume. Let 
$\mathbf{g} \in B$ be a non-zero lattice point in B. Then $-\mathbf{g}$ is a non-zero lattice point distinct from $\mathbf{g}$, 
and $-\mathbf{g} \in B$ by symmetry, so B contains actually at least two non-zero lattice points. 


Proof. (Mordell, 1934). $V(B)/\det \Lambda$ is invariant under non-singular linear maps. 
Therefore, after applying a linear transformation, we may assume that $\Lambda = \mathbb{Z}^n$. Then 
the theorem reduces to: If $V(B) > 2^n$, then B contains a non-zero integer point. Let 
$B_m$ be the set of rational points in B with common denominator m. Then 

$$\frac{|B_m|}{m^n} \to V(B) \quad \text{as } m \to \infty$$
where | | denotes the cardinality. For m sufficiently large, $|B_m|/m^n > 2^n$ and thus 
$|B_m| > (2m)^n$. So there are two distinct points $\mathbf{a} = (a_1/m, \dots, a_n/m)$, $\mathbf{b} = (b_1/m, \dots, b_n/m)$ 
in $B_m$ with 
$$a_i \equiv b_i \pmod{2m} \quad (i = 1, \dots, n).$$

Then 
$$\tfrac{1}{2}(\mathbf{a} - \mathbf{b}) \in B$$
since the midpoint of $\mathbf{a}$ and $-\mathbf{b}$ is $\tfrac{1}{2}(\mathbf{a} - \mathbf{b}) \in B$. Let $\mathbf{g} = \tfrac{1}{2}(\mathbf{a} - \mathbf{b})$. Clearly $\mathbf{g}$ is a non-zero 
integer point. 


Exercise 2a. If B is symmetric, convex, and $V(B) > 2^n k \det \Lambda$ where $k \in \mathbb{N}$, then 
B contains at least k pairs of non-zero lattice points. 

A convex body is a compact, convex set containing $\mathbf{0}$ as an interior point. Such a 
body clearly has $0 < V(B) < \infty$. 


Remark 2C. If B is a symmetric, convex body and $V(B) \ge 2^n \det \Lambda$, then B 
contains a non-zero lattice point. It is easy to show that 2C follows from 2B, and vice 
versa. 


Remark 2D. Theorem 2B is best possible. Take $\Lambda = \mathbb{Z}^n$, B the cube with $|x_i| < 1$ 
$(i = 1, \dots, n)$. Then $V(B) = 2^n = 2^n \det \Lambda$ and there are no non-zero integer points in B. 

Minkowski defines successive minima as follows: Given B, $\Lambda$ where B is symmetric, 
convex, bounded, and with $\mathbf{0}$ in its interior, let $\lambda_1 = \inf\{\lambda : \lambda B$ contains a non-zero lattice 
point$\}$.* More generally, for $1 \le j \le n$, $\lambda_j = \inf\{\lambda : \lambda B$ contains j linearly independent 
lattice points$\}$. Then 

$$0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n < \infty.$$
Here $\lambda_1 > 0$ since B is bounded and $\lambda_n < \infty$ since $\mathbf{0}$ is an interior point. 


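THEOREM 2E. (Minkowski’s Second Theorem on Convex Bodies.) 

$$\frac{2^n}{n!} \det \Lambda \le \lambda_1 \cdots \lambda_n V(B) \le 2^n \det \Lambda. \tag{2.3}$$


Example. Let n = 2, $\Lambda = \mathbb{Z}^2$ and B the rectangle $|x_1| \le k$, $|x_2| \le 1/k$ where 
$k \ge 1$. Then $\lambda_1 = 1/k$, $\lambda_2 = k$ and $\lambda_1\lambda_2V(B) = V(B) = 4 = 2^2 \det \Lambda$. 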

A proof will not be given here. There is no really simple proof of the upper bound 
in (2.3). A weaker bound is proved in Schmidt (1980, p. 88). 


Remark 2F. The constants $2^n/n!$ and $2^n$ are best possible. Let $\Lambda = \mathbb{Z}^n$ and B the 
cube $|x_i| \le 1$. Then $V(B) = 2^n$. Now $\pm\mathbf{e}_1, \dots, \pm\mathbf{e}_n$ are on the boundary, so $\lambda_1 = \cdots = 
\lambda_n = 1$. We have $\lambda_1 \cdots \lambda_n V(B) = 2^n$ and $2^n \det \Lambda = 2^n$. So we get equality on the 
right-hand side of (2.3). Next, let $\Lambda = \mathbb{Z}^n$ and B the “octahedron” $|x_1| + \cdots + |x_n| \le 1$. 
It may be seen that $V(B) = 2^n/n!$. We have again $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 1$. Thus 
$(2^n/n!) \det \Lambda = 2^n/n!$ and $\lambda_1 \cdots \lambda_n V(B) = 2^n/n!$. We have equality on the left-hand 
side of (2.3). 
Note that 
$$\lambda_1^n V(B) \le 2^n \det \Lambda \tag{2.4}$$
since $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$. Suppose now that $V(B) > 2^n \det \Lambda$. Then $\lambda_1 < 1$ so 
that $B = 1B \supset \lambda_1 B$ contains a non-zero lattice point. Therefore (2.4) implies 2B. It is 
easily seen that (2.4) and 2B are equivalent. 

*$\lambda B$ is the set of points $\lambda\mathbf{x}$ with $\mathbf{x} \in B$. 


Exercise 2b. Suppose B is a symmetric, convex set, $\Lambda$ a lattice, $\lambda_1$ the first 
minimum. Given $\mu > 0$, the number of lattice points in $\mu B$ is $\le ((2\mu/\lambda_1) + 1)^n$. 


§3. Lattice Packings. 


A good reference for general packing and covering problems is C. A. Rogers (1964). 
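Let $B \subset \mathbb{R}^n$ be compact and measurable. Given a lattice $\Lambda$, write $B + \Lambda = \{\mathbf{b} + \boldsymbol{\ell} : 
\mathbf{b} \in B,\ \boldsymbol{\ell} \in \Lambda\}$. 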


If the translates of B by vectors of A are disjoint (as in the picture), then we call 
B +A a lattice packing of B. Having disjoint translates is equivalent to having unique 
representations of the vectors of B+ A as b + £. 

The density of such a lattice packing is 

$$\delta(B, \Lambda) = \lim_{r \to \infty} \frac{V(\Lambda + B, r)}{V(r)} \tag{3.1}$$

where V(r) is the volume of a ball of radius r, and $V(\Lambda + B, r)$ is the volume of the 
intersection of $\Lambda + B$ with the ball of radius r and center $\mathbf{0}$. 


Exercise 3a. Show that the limit (3.1) always exists under our hypotheses, and 
that (with $\Pi$ a fundamental parallelepiped) 

$$\delta(B, \Lambda) = V(\Pi \cap (\Lambda + B))/V(\Pi) = V(B)/\det \Lambda.$$


We define 
$$\delta(B) = \sup \delta(B, \Lambda),$$
with the supremum taken over the lattices $\Lambda$ for which $\Lambda + B$ is a lattice packing. 

This is invariant under nonsingular linear transformations T, since $\delta(TB, T\Lambda) = \delta(B, \Lambda)$. 
Suppose B is convex and symmetric about $\mathbf{0}$. Suppose B contains no non-zero 
lattice point. 


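Claim: Then $\Lambda + \tfrac{1}{2}B$ is a lattice packing. 

Justification: Suppose $\boldsymbol{\ell}_1 + \tfrac{1}{2}\mathbf{b}_1 = \boldsymbol{\ell}_2 + \tfrac{1}{2}\mathbf{b}_2$ where $\boldsymbol{\ell}_1, \boldsymbol{\ell}_2 \in \Lambda$, $\mathbf{b}_1, \mathbf{b}_2 \in B$. Then by 
an argument used above, $\tfrac{1}{2}\mathbf{b}_1 - \tfrac{1}{2}\mathbf{b}_2$ is in B. But $\tfrac{1}{2}\mathbf{b}_1 - \tfrac{1}{2}\mathbf{b}_2 = \boldsymbol{\ell}_2 - \boldsymbol{\ell}_1$ lies in $\Lambda$, and B contains 
no non-zero lattice points, so that $\boldsymbol{\ell}_2 - \boldsymbol{\ell}_1 = \mathbf{0}$, $\boldsymbol{\ell}_1 = \boldsymbol{\ell}_2$, hence $\mathbf{b}_1 = \mathbf{b}_2$. Therefore $\Lambda + \tfrac{1}{2}B$ 
is indeed a lattice packing. 

It follows that 

$$\frac{1}{2^n}\,\frac{V(B)}{\det \Lambda} = \frac{V(\tfrac{1}{2}B)}{\det \Lambda} = \delta(\tfrac{1}{2}B, \Lambda) \le \delta(\tfrac{1}{2}B) = \delta(B)$$

and therefore 

$$V(B) \le 2^n \delta(B) \det \Lambda.$$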


THEOREM 3A. If B is convex and symmetric about $\mathbf{0}$ and $V(B) > 2^n \delta(B) \det \Lambda$, 
then B contains a non-zero lattice point. 

In particular, this happens when $V(B) > 2^n \det \Lambda$. So 3A is a strengthening of 
Minkowski’s Theorem 2B. 


Remark 3B. For B convex, symmetric about $\mathbf{0}$, one can show that $\delta(B) < 1$ 
except for certain polyhedra. E.g., the cube has density $\delta = 1$. So do regular hexagons 
in the plane. 


Let $\delta_n = \delta(B)$ where B is a ball in $\mathbb{R}^n$. Consider the following picture. 


The “triangle” lattice $\Lambda$ has $\det \Lambda = \tfrac{1}{2}\sqrt{3}$. It is easy to guess that 

$$\delta_2 = \frac{\pi/4}{\det \Lambda} = \frac{\pi}{2\sqrt{3}}.$$
This had already been proved implicitly by Gauss in his theory of positive definite binary 
quadratic forms. We know the values of $\delta_2, \delta_3, \dots, \delta_8$; see Cassels (1959, Appendix) and 
Exercise 3b below. The estimation of $\delta_n$ for large n remains among the central unsolved 
problems in the Geometry of Numbers. Blichfeldt (1929) proved that $\delta_n \le 2^{-n/2}(1 + \tfrac{n}{2})$. 
Also, $\delta_n \ge (\tfrac{1}{2} - \varepsilon)^n$ if $n > n_0(\varepsilon)$. See Cassels (1959, p. 249). More recently, G.A. 
Kabatjanskii and V.I. Levenšteĭn (1978) have shown that $\delta_n < 2^{-0.599n(1-\varepsilon)}$ for $n > 
n_0(\varepsilon)$. 

One may in a fairly obvious way define a general (not necessarily lattice) packing 
of a set B, and the maximum general packing density. For a disk in $\mathbb{R}^2$, Thue (1892) 
had shown that the maximum packing density is in fact achieved for a lattice packing. 
It is not known whether a similar result holds for a ball in $\mathbb{R}^3$. It is generally believed 
that for a ball in $\mathbb{R}^n$, where n is sufficiently large, the densest lattice packing density is less than 
the densest general packing density. 

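Now $V(B)\lambda_1^n \le 2^n \det \Lambda\, \delta(B)$, so that $\lambda_1 \le 2(\det \Lambda)^{1/n}(\delta(B)/V(B))^{1/n}$. For the 
unit ball B, $V(B) = V(n) = \pi^{n/2}/\Gamma(1 + \tfrac{n}{2})$, so that by Stirling’s formula we have the 
asymptotic relation 

$$V(n)^{1/n} = V(B)^{1/n} \sim \sqrt{2\pi e/n} \quad \text{as } n \to \infty.$$
We define Hermite’s constant $\gamma_n$ to be least such that for any lattice $\Lambda$ 
$$\lambda_1 \le \gamma_n^{1/2} (\det \Lambda)^{1/n}$$
where $\lambda_1 = \lambda_1$(unit ball). So 

$$\gamma_n \le \frac{4\delta_n^{2/n}}{V(n)^{2/n}} \le \frac{2n}{3} \quad \text{if } n \ge 2.$$

We have $\limsup(\gamma_n/n) \le \frac{4}{2\pi e} = \frac{2}{\pi e}$, by using the trivial estimate $\delta_n \le 1$. If instead we 
use Blichfeldt’s estimate, we obtain $\limsup(\gamma_n/n) \le \frac{1}{\pi e}$. The result of Kabatjanskii and 
Levenšteĭn quoted above yields $\limsup(\gamma_n/n) \le 2^{-0.198}/(\pi e)$. 
Exercise 3b. Show that $\gamma_n = 4\delta_n^{2/n}/V(n)^{2/n}$. 


Exercise 3c. Let $Q(\mathbf{X}) = \sum_{i,j=1}^{n} a_{ij}X_iX_j$ be a positive definite quadratic form 
with real coefficients $a_{ij} = a_{ji}$. Then there exists a non-zero integer point $\mathbf{x}$ with 
$Q(\mathbf{x}) \le \gamma_n(\det Q)^{1/n}$. Moreover, $\gamma_n$ is least with this property. 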


§4. Siegel’s Lemma Again. 


A rational subspace of $\mathbb{R}^n$ or $\mathbb{C}^n$ is a subspace spanned by vectors with rational 
coordinates. A rational subspace $S^k$ of dimension k is spanned by k vectors $\mathbf{a}_1, \dots, \mathbf{a}_k \in 
\mathbb{Q}^n$. Such a space $S^k$ may be defined by n − k linear homogeneous equations with rational 
coefficients. The integer points in a subspace $S^k$ form a set $\Lambda$ which is a lattice of $S^k$ 
by Lemma 2A. 
The height of $S^k$ is defined by 
$$H(S^k) = \det \Lambda.$$


An integer point $\mathbf{a} = (a_1, \dots, a_n)$ is called primitive if $\gcd(a_1, \dots, a_n) = 1$. Then 
$\mathbf{a}$ is “closest to the origin,” i.e., there is no integer point on the line segment strictly between 
$\mathbf{0}$ and $\mathbf{a}$. Either $\mathbf{a}$ or $-\mathbf{a}$ is a basis of the 1-dimensional lattice of integer points of the 
space $S^1$ spanned by $\mathbf{a}$. Thus $H(S^1) = |\mathbf{a}|$. 


By the definition of Hermite’s constant, there is on $S^k$ an integer point $\mathbf{x} \ne \mathbf{0}$ with 

$$|\mathbf{x}| \le \gamma_k^{1/2} H(S^k)^{1/k}.$$


LEMMA 4A. Consider the unit cube C in $\mathbb{R}^n$, i.e., $|x_i| \le 1$ $(i = 1, \dots, n)$. Let 
$S^k$ be a k-dimensional subspace. Then $C \cap S^k$ has k-dimensional volume $\ge 2^k$. 

This result, due to J. Vaaler (1979), will not be proved here. 

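Let $S^k$ be a rational subspace, $\Lambda$ the lattice of integer points associated with $S^k$, 
i.e., $\Lambda = \Lambda(S^k)$. Let $B = C \cap S^k$. By Minkowski’s Theorem 2C, $\lambda_1^k V(B) \le 2^k \det \Lambda$. 
Now $V(B) \ge 2^k$ so that 
$$\lambda_1^k 2^k \le 2^k \det \Lambda, \qquad \lambda_1^k \le \det \Lambda = H(S^k), \qquad \lambda_1 \le H(S^k)^{1/k}.$$
Recall that $|\mathbf{x}|$ was the Euclidean norm. Let 

$$\overline{|\mathbf{x}|} = \max(|x_1|, \dots, |x_n|)$$
be the maximum norm. Our results may be summarized in 
LEMMA 4B. Given a rational subspace $S^k$ there is an integer point $\mathbf{x} \ne \mathbf{0}$ on $S^k$ 
with 
$$|\mathbf{x}| \le \gamma_k^{1/2} H(S^k)^{1/k}.$$

Also, there is such a point $\mathbf{x}'$ with 
$$\overline{|\mathbf{x}'|} \le H(S^k)^{1/k}.$$

When $S^k$ is a rational subspace, then $(S^k)^\perp$ is a rational subspace of dimension 
m = n − k. 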


LEMMA 4C. $H(S^\perp) = H(S)$. 
The proof is postponed to the next section (see Corollary 5J). To make the lemma 
correct for $S^0$ and $S^n = \mathbb{R}^n$, we set $H(S^0) = 1$. 
Let us go back to a system of linear equations 

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= 0 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= 0 \end{aligned}$$

where 0 < m < n, $a_{ij} \in \mathbb{Z}$. If these equations are independent, they define a rational 
subspace $S^k$ of dimension k = n − m. And $(S^k)^\perp$ is spanned by the row vectors 
$\mathbf{a}_1, \dots, \mathbf{a}_m$ where $\mathbf{a}_i = (a_{i1}, \dots, a_{in})$. Then 
$$H(S^k) = H(S^{k\perp}) \le |\mathbf{a}_1| \cdots |\mathbf{a}_m|.$$
The last inequality occurs because $\mathbf{a}_1, \dots, \mathbf{a}_m$ can be written as linear combinations 
of basis vectors for the lattice, so that $|\det(\mathbf{a}_1, \dots, \mathbf{a}_m)|$ is an integer multiple of the 
determinant of a basis.† Thus 
$$\det(\Lambda(S^{k\perp})) \le |\det(\mathbf{a}_1, \dots, \mathbf{a}_m)| \le |\mathbf{a}_1| \cdots |\mathbf{a}_m|$$
by Hadamard’s inequality, which is a consequence of Lemma 5E below. 


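LEMMA 4D. (Siegel’s Lemma) Given the system of equations above, 
(i) there is a non-trivial integer solution $\mathbf{x}$ with 

$$|\mathbf{x}| \le \gamma_{n-m}^{1/2} (|\mathbf{a}_1| \cdots |\mathbf{a}_m|)^{1/(n-m)} \le \left(\tfrac{2}{3}n\right)^{1/2} (\sqrt{n}A)^{m/(n-m)}$$

if $|a_{ij}| \le A$ for every i, j. 
(ii) Also, there is a non-trivial integer solution $\mathbf{x}'$ with 

$$\overline{|\mathbf{x}'|} \le (|\mathbf{a}_1| \cdots |\mathbf{a}_m|)^{1/(n-m)} \le (\sqrt{n}A)^{m/(n-m)}.$$

In the first inequality we used that $\gamma_{n-m} \le \tfrac{2}{3}(n - m) \le \tfrac{2}{3}n$ if $n - m \ge 2$, and 
$\gamma_{n-m} = 1 \le \tfrac{2}{3}n$ if n − m = 1, so that $n \ge 2$. It is clear that we do not have to restrict 
to the case when the m equations are independent. 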


Remark 4E. If Minkowski’s Second Theorem (2E) is used, (ii) can be strengthened 
to get the following: there are n − m linearly independent solutions $\mathbf{x}_1, \dots, \mathbf{x}_{n-m}$ of our 
system of equations such that 

$$\overline{|\mathbf{x}_1|} \cdots \overline{|\mathbf{x}_{n-m}|} \le |\mathbf{a}_1| \cdots |\mathbf{a}_m| \le (\sqrt{n}A)^m.$$

The first assertion can be strengthened in the same way, but this is not so obvious. 


†We think of $\mathbf{a}_1, \dots, \mathbf{a}_m$ as vectors with m components in terms of an orthonormal coordinate 
system in $S^{k\perp}$. 


§5. Grassman Algebra. 


(Also presented in Schmidt (1980), Ch. IV.) Notation: Let K be a field, $K^n$ a vector 
space, and $\mathbf{e}_i = (0, \dots, 0, 1, 0, \dots, 0)$ $(i = 1, \dots, n)$ be basis vectors. Suppose $0 \le p \le n$. Let C(n, p) 
be the set of p-tuples $\sigma = \{i_1, \dots, i_p\}$ with $i_j \in \mathbb{Z}$ and $1 \le i_1 < i_2 < \cdots < i_p \le n$. 
This set has cardinality 
$$|C(n, p)| = \binom{n}{p}.$$

Let $E_\sigma$ be the formal expression 

$$E_\sigma = \mathbf{e}_{i_1} \wedge \mathbf{e}_{i_2} \wedge \cdots \wedge \mathbf{e}_{i_p}.$$

There are $\binom{n}{p}$ such expressions. For p = 0, let $E_\emptyset = 1$. Let $K_p^n$ be the vector space 
over K generated by $E_\sigma$ with $\sigma \in C(n, p)$. Then $\dim(K_p^n) = \binom{n}{p}$. Elements of $K_p^n$ are 
called p-vectors. 


Special cases: $K_1^n = K^n$, $K_0^n = K$, and $K_n^n$ is spanned by the single vector 
$\mathbf{e}_1 \wedge \mathbf{e}_2 \wedge \cdots \wedge \mathbf{e}_n$. 
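We now introduce a more flexible notation. For any p and any integers $i_1, \dots, i_p$ 
between 1 and n, the symbol $\mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p}$ should be 0 if $i_j = i_{j'}$ for some $j \ne j'$. 
Otherwise, if $\{i_1, \dots, i_p\} = \{j_1, \dots, j_p\}$ (considered as unordered sets), where $j_1 < 
j_2 < \cdots < j_p$, then 
$$\mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p} = \pm\, \mathbf{e}_{j_1} \wedge \cdots \wedge \mathbf{e}_{j_p}$$
with the + sign if we get the i’s from the j’s by an even permutation, and with the − 
sign otherwise. 

Set 

$$G_n = K_0^n \oplus K_1^n \oplus \cdots \oplus K_n^n.$$

Then $\dim G_n = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{n} = 2^n$. 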

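We are going to make $G_n$ into an algebra over K. We need a product, or wedge $\wedge$. 
By linearity, it suffices to define products of the basis vectors $E_\sigma$. We set 

$$1 \wedge 1 = 1,$$
$$1 \wedge (\mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p}) = \mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p},$$
$$(\mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p}) \wedge 1 = \mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p},$$
$$(\mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p}) \wedge (\mathbf{e}_{j_1} \wedge \cdots \wedge \mathbf{e}_{j_q}) = \mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p} \wedge \mathbf{e}_{j_1} \wedge \cdots \wedge \mathbf{e}_{j_q}.$$

Initially, this is given for $i_1 < \cdots < i_p$ and $j_1 < \cdots < j_q$, but clearly it remains true in 
general. We have $\mathbf{e}_i \wedge \mathbf{e}_j = -\mathbf{e}_j \wedge \mathbf{e}_i$. This algebra is associative. Note that this fits in 
with the original notation $E_\sigma = \mathbf{e}_{i_1} \wedge \cdots \wedge \mathbf{e}_{i_p}$. The resulting algebra is called the Grassman 
algebra, or exterior algebra. If $\mathbf{x}_1, \dots, \mathbf{x}_p \in K^n$, then 
$$\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \in K_p^n.$$

Such a p-vector is called decomposable. 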
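LEMMA 5A. Suppose $\mathbf{x}_i = (\xi_{i1}, \dots, \xi_{in}) = \sum_{j=1}^{n} \xi_{ij}\mathbf{e}_j$ $(i = 1, \dots, p)$. Then 
$$\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p = \sum_{\sigma \in C(n,p)} \xi_\sigma E_\sigma,$$

where $\xi_\sigma$ is the p × p determinant $|\xi_{ij}|$ with $1 \le i \le p$, $j \in \sigma$. 
For example, when n = 3, p = 2, 

$$\mathbf{x}_1 \wedge \mathbf{x}_2 = \begin{vmatrix} \xi_{11} & \xi_{12} \\ \xi_{21} & \xi_{22} \end{vmatrix} E_{12} + \begin{vmatrix} \xi_{11} & \xi_{13} \\ \xi_{21} & \xi_{23} \end{vmatrix} E_{13} + \begin{vmatrix} \xi_{12} & \xi_{13} \\ \xi_{22} & \xi_{23} \end{vmatrix} E_{23}.$$
The linear map with $E_{12} \mapsto \mathbf{e}_3$, $E_{13} \mapsto -\mathbf{e}_2$, $E_{23} \mapsto \mathbf{e}_1$ identifies $K_2^3$ with $K^3$ and the 
wedge product with the cross product. 
When n = 4, p = 2, 
$$\mathbf{x}_1 \wedge \mathbf{x}_2 = \begin{vmatrix} \xi_{11} & \xi_{12} \\ \xi_{21} & \xi_{22} \end{vmatrix} E_{12} + \cdots + \begin{vmatrix} \xi_{13} & \xi_{14} \\ \xi_{23} & \xi_{24} \end{vmatrix} E_{34},$$
which has six terms. 
When p = n, 
$$\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_n = \begin{vmatrix} \xi_{11} & \cdots & \xi_{1n} \\ \vdots & & \vdots \\ \xi_{n1} & \cdots & \xi_{nn} \end{vmatrix} E_{12 \dots n}.$$
Proof. The left-hand side is linear in each vector $\mathbf{x}_i$; the right-hand side is linear, 
too. So it suffices to consider the case when $\mathbf{x}_1, \dots, \mathbf{x}_p \in \{\mathbf{e}_1, \dots, \mathbf{e}_n\}$. If two of the 
$\mathbf{x}_i$’s are the same, then both sides vanish. So without loss of generality $\mathbf{x}_i = \mathbf{e}_{j_i}$ 
$(i = 1, \dots, p)$ with $j_i \in \{1, \dots, n\}$ distinct. Since both sides behave in obvious ways 
under permutations of vectors, we may suppose $j_1 < j_2 < \cdots < j_p$. Then the left-hand 
side is $E_\tau$ where $\tau = \{j_1, \dots, j_p\}$. On the right-hand side $\xi_\sigma = 1$ if $\sigma = \tau$, 
but $\xi_\sigma = 0$ if $\sigma \ne \tau$, since $\xi_\sigma$ is the determinant of the submatrix of $(\xi_{ij})$ with columns 
$i_1, \dots, i_p$, where $\sigma = \{i_1, \dots, i_p\}$. 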
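Lemma 5A gives a direct recipe for the coordinates of a decomposable p-vector: take all p × p minors of the matrix whose rows are the given vectors. The sketch below is an illustration under stated assumptions: the function names are invented here, column indices are 0-based (so the key `(0, 1)` stands for $E_{12}$), and the final lines check the identification with the cross product for n = 3, p = 2 on sample vectors.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def grassman_coordinates(vectors):
    """Coordinates xi_sigma of x_1 ^ ... ^ x_p as in Lemma 5A: the p x p minors
    of the matrix with rows x_i, indexed by increasing 0-based column sets."""
    p, n = len(vectors), len(vectors[0])
    return {sigma: det([[row[j] for j in sigma] for row in vectors])
            for sigma in combinations(range(n), p)}

def cross(x, y):
    """Ordinary cross product in 3 dimensions."""
    return [x[1] * y[2] - x[2] * y[1], x[2] * y[0] - x[0] * y[2], x[0] * y[1] - x[1] * y[0]]

x, y = [1, 2, 3], [4, 5, 6]              # sample vectors, chosen for illustration
xi = grassman_coordinates([x, y])
# E_12 -> e_3, E_13 -> -e_2, E_23 -> e_1
image = [xi[(1, 2)], -xi[(0, 2)], xi[(0, 1)]]
print(xi, image, cross(x, y))
```

Swapping the two vectors changes the sign of every coordinate, in line with $\mathbf{e}_i \wedge \mathbf{e}_j = -\mathbf{e}_j \wedge \mathbf{e}_i$.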


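A consequence of Lemma 5A is Laplace’s expansion of a determinant after a set 
of rows. For simplicity we will deal only with expansion after the first p rows. Given 
p, q with p + q = n, and given $\sigma \in C(n, p)$, let $\bar\sigma \in C(n, q)$ be the complement of $\sigma$ in 
$\{1, \dots, n\}$. Let $\varepsilon(\sigma, \bar\sigma)$ be 1 or −1, depending on whether $(\sigma, \bar\sigma)$ is an even or an odd 
permutation of $\{1, \dots, n\}$. Let $\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_q$ be in $K^n$, and write 

$$\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p = \sum_{\sigma \in C(n,p)} \xi_\sigma E_\sigma, \qquad \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q = \sum_{\tau \in C(n,q)} \eta_\tau E_\tau.$$

Then 
$$\begin{aligned} \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q &= \sum_{\sigma \in C(n,p)} \sum_{\tau \in C(n,q)} \xi_\sigma \eta_\tau\, E_\sigma \wedge E_\tau \\ &= \sum_{\sigma \in C(n,p)} \xi_\sigma \eta_{\bar\sigma}\, E_\sigma \wedge E_{\bar\sigma} \\ &= \Big( \sum_{\sigma \in C(n,p)} \varepsilon(\sigma, \bar\sigma)\, \xi_\sigma \eta_{\bar\sigma} \Big) E_{12 \dots n}. \end{aligned}$$
By Lemma 5A, 
$$\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q = (\det M)\, E_{12 \dots n}$$
where M is the matrix with rows $\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_q$. We therefore have 


LEMMA 5B. (Laplace expansion of a determinant.) 

$$\det M = \sum_{\sigma \in C(n,p)} \varepsilon(\sigma, \bar\sigma)\, \xi_\sigma \eta_{\bar\sigma}.$$


Note that by Lemma 5A the $\xi_\sigma$ are the (p × p)-determinants from the rows 
$\mathbf{x}_1, \dots, \mathbf{x}_p$ of M, and the $\eta_{\bar\sigma}$ are the (q × q)-determinants from the complementary 
rows $\mathbf{y}_1, \dots, \mathbf{y}_q$. 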


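LEMMA 5C. $\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p = 0$ if and only if $\mathbf{x}_1, \dots, \mathbf{x}_p$ are linearly dependent. 


Proof. It is an immediate consequence of Lemma 5A. 


LEMMA 5D. If $\mathbf{x}_1, \dots, \mathbf{x}_p$ are linearly independent and $\mathbf{y}_1, \dots, \mathbf{y}_p$ are linearly 
independent, then $\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$ is proportional to $\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_p$ if and only if $\mathbf{x}_1, \dots, \mathbf{x}_p$ 
and $\mathbf{y}_1, \dots, \mathbf{y}_p$ span the same subspace of $K^n$. 

Proof. If the $\mathbf{x}$’s and $\mathbf{y}$’s span the same subspace, then each $\mathbf{y}_i$ $(i = 1, \dots, p)$ 
is a linear combination of $\mathbf{x}_1, \dots, \mathbf{x}_p$, so that $\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_p$ is a multiple of $\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$. 
The factor is the determinant of the p × p matrix expressing the $\mathbf{y}_i$’s in terms of the 
$\mathbf{x}_i$’s. Conversely, suppose that $\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p = \lambda(\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_p)$. For any $\mathbf{z}$, the vector 
$\mathbf{z} \wedge (\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p) = \mathbf{z} \wedge \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$ is zero precisely when $\mathbf{z}$ lies in the space spanned 
by $\mathbf{x}_1, \dots, \mathbf{x}_p$. But $\mathbf{y}_i \wedge (\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p) = \lambda(\mathbf{y}_i \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_p) = 0$ since $\mathbf{y}_i$ occurs twice. 
So $\mathbf{y}_i$ is in the space spanned by $\mathbf{x}_1, \dots, \mathbf{x}_p$ $(i = 1, \dots, p)$. Therefore the spaces are the 
same. 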

Let $S^p \subset K^n$ be a subspace of dimension p. Let $\mathbf{x}_1, \dots, \mathbf{x}_p$ be a basis of $S^p$. Then 
let $X = \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$, which is a vector with $\ell = \binom{n}{p}$ components and which lies in 
$K_p^n \cong K^\ell$. The components of X are called the “Grassman coordinates of $S^p$”. By 
the lemma, the Grassman coordinates are given up to a factor. Grassman coordinates 
of distinct p-dimensional subspaces are not proportional. Incidentally, the Grassman 
coordinates in general do not fill all of $K_p^n$, i.e., not every p-vector is decomposable. A 
heuristic argument is that the p-dimensional subspaces of $K^n$ constitute a “manifold” 
with p(n − p) degrees of freedom, so that the Grassman coordinates should be a manifold 
of dimension p(n − p) + 1, and for most cases with 0 < p < n we have $p(n - p) + 1 < 
\binom{n}{p} = \dim K_p^n$. 

Now suppose that $K = \mathbb{R}$ or $\mathbb{C}$. Make $\mathbb{R}^n$ into a Euclidean space or introduce a 
Hermitian metric on $\mathbb{C}^n$ with $\mathbf{e}_i\mathbf{e}_j = \delta_{ij}$ $(1 \le i, j \le n)$. Thus in the Hermitian case, 
if $\mathbf{x} = (\xi_1, \dots, \xi_n)$, $\mathbf{y} = (\eta_1, \dots, \eta_n)$, then $\mathbf{x}\mathbf{y} = \xi_1\overline{\eta}_1 + \cdots + \xi_n\overline{\eta}_n$. Introduce a similar 
metric on $K_p^n$ with 
$$E_\sigma E_\tau = \begin{cases} 1 & \text{if } \sigma = \tau, \\ 0 & \text{otherwise} \end{cases} \;=\; \delta_{\sigma\tau}.$$


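LEMMA 5E. (Laplace identity.) Given $\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_p$ in $\mathbb{R}^n$ (or $\mathbb{C}^n$), we 
have 

$$(\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p)(\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_p) = \begin{vmatrix} \mathbf{x}_1\mathbf{y}_1 & \cdots & \mathbf{x}_1\mathbf{y}_p \\ \vdots & & \vdots \\ \mathbf{x}_p\mathbf{y}_1 & \cdots & \mathbf{x}_p\mathbf{y}_p \end{vmatrix}.$$
Here the inner product of the left-hand side is in $\mathbb{R}_p^n$ (or $\mathbb{C}_p^n$), but each inner product 
on the right-hand side is in $\mathbb{R}^n$ (or $\mathbb{C}^n$). 


Exercise 5a. Prove Lemma 5E, using linearity. 
A consequence of the Laplace identity is that 
$$|\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p| = \begin{vmatrix} \mathbf{x}_1\mathbf{x}_1 & \cdots & \mathbf{x}_1\mathbf{x}_p \\ \vdots & & \vdots \\ \mathbf{x}_p\mathbf{x}_1 & \cdots & \mathbf{x}_p\mathbf{x}_p \end{vmatrix}^{1/2}.$$

As we have seen, this is the volume of the p-dimensional parallelepiped spanned by 
$\mathbf{x}_1, \dots, \mathbf{x}_p$. 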
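The Laplace identity can be tested on integer vectors in the real case, where no complex conjugation is involved. The following is a small sketch with made-up sample vectors; the helper names are assumptions introduced for the illustration, and the inner product on p-vectors is computed from the coordinates of Lemma 5A.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minors(vectors):
    """The p x p minors of the matrix with the given rows, in a fixed order of column sets."""
    p, n = len(vectors), len(vectors[0])
    return [det([[row[j] for j in sigma] for row in vectors])
            for sigma in combinations(range(n), p)]

def inner_matrix(xs, ys):
    """Matrix of the inner products x_i y_j (real case)."""
    return [[sum(u * v for u, v in zip(x, y)) for y in ys] for x in xs]

# Sample vectors in R^4 with p = 2, chosen only for illustration.
xs = [[1, 2, 0, 1], [0, 1, 3, -1]]
ys = [[2, 0, 1, 1], [1, 1, 0, 2]]

lhs = sum(a * b for a, b in zip(minors(xs), minors(ys)))  # inner product of the 2-vectors
rhs = det(inner_matrix(xs, ys))                           # determinant of inner products
squared_volume = sum(a * a for a in minors(xs))           # |x_1 ^ x_2|^2
print(lhs, rhs, squared_volume, det(inner_matrix(xs, xs)))
```

Taking the two families equal gives the squared length of $\mathbf{x}_1 \wedge \mathbf{x}_2$ as a Gram determinant, which is the consequence stated after Exercise 5a.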


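LEMMA 5F. For any $\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_q$, 
(i) $|\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q| \le |\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p|\,|\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q|$, where equality holds*** 
if $\mathbf{x}_i\mathbf{y}_j = 0$ $(1 \le i \le p,\ 1 \le j \le q)$. 

(ii) For any $\mathbf{u}_1, \dots, \mathbf{u}_\ell$, $\mathbf{x}_1, \dots, \mathbf{x}_p$ and $\mathbf{y}_1, \dots, \mathbf{y}_q$, 

$$|\mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_\ell|\,|\mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_\ell \wedge \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q|$$
$$\le |\mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_\ell \wedge \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p|\,|\mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_\ell \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q|.$$

***There are other cases of equality, e.g. when $\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p = 0$. 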
There is a geometric interpretation to (i): The volume of the (p + q)-dimensional 
parallelepiped spanned by $\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_q$ does not exceed the product of the volumes 
of the p-dimensional parallelepiped spanned by $\mathbf{x}_1, \dots, \mathbf{x}_p$ and the q-dimensional 
parallelepiped spanned by $\mathbf{y}_1, \dots, \mathbf{y}_q$. This is well-known, e.g. when p = 2, q = 1. 
= =q 


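Proof. The length of a wedge product is invariant under orthogonal (unitary) 
linear transformations in $\mathbb{R}^n$ (or $\mathbb{C}^n$). 

(i) is obvious if $\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_q$ are linearly dependent. So we may assume that 
$\mathbf{x}_1, \dots, \mathbf{x}_p, \mathbf{y}_1, \dots, \mathbf{y}_q$ span a space S of dimension m = p + q. After a suitable orthogonal 
(or unitary) transformation, S consists of points $(\xi_1, \dots, \xi_m, 0, \dots, 0)$. So we may 
write $\mathbf{x}_i = (\xi_{i1}, \dots, \xi_{im}, 0, \dots, 0)$ and $\mathbf{y}_j = (\eta_{j1}, \dots, \eta_{jm}, 0, \dots, 0)$ $(i = 1, \dots, p;\ j = 
1, \dots, q)$. Then 

$$\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q = \left( \begin{vmatrix} \xi_{11} & \cdots & \xi_{1m} \\ \vdots & & \vdots \\ \xi_{p1} & \cdots & \xi_{pm} \\ \eta_{11} & \cdots & \eta_{1m} \\ \vdots & & \vdots \\ \eta_{q1} & \cdots & \eta_{qm} \end{vmatrix}, 0, \dots, 0 \right).$$

The length is the absolute value of this m × m determinant. $\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$ has coordinates 
which are the p × p determinants from $(\xi_{ij})$ and $\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q$ has coordinates which 
are the (q × q) determinants from $(\eta_{ij})$. The first assertion of (i) follows from Laplace’s 
expansion of the m × m determinant with respect to the first p rows and last q rows 
and from Cauchy’s inequality. (Laplace’s expansion has $\binom{m}{p} = \binom{m}{q}$ summands.) 

For (ii), we can write $\mathbf{x}_i = \mathbf{x}_i' + \mathbf{x}_i''$ $(i = 1, \dots, p)$ where the $\mathbf{x}_i'$’s are spanned by the 
$\mathbf{u}$’s and the $\mathbf{x}_i''$’s are orthogonal to the $\mathbf{u}$’s $(i = 1, \dots, p)$. Similarly, $\mathbf{y}_j = \mathbf{y}_j' + \mathbf{y}_j''$ $(j = 
1, \dots, q)$. The exterior products in (ii) are unchanged if we replace the $\mathbf{x}_i$’s with the $\mathbf{x}_i''$’s and 
the $\mathbf{y}_j$’s with the $\mathbf{y}_j''$’s. So we may suppose, without loss of generality, that $\mathbf{x}_i\mathbf{u}_s = \mathbf{y}_j\mathbf{u}_s = 0$ 
$(i = 1, \dots, p;\ j = 1, \dots, q;\ s = 1, \dots, \ell)$. Then, using the second assertion in (i), the 
left-hand side becomes 

$$|\mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_\ell|^2\, |\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p \wedge \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q|,$$

the right-hand side becomes 

$$|\mathbf{u}_1 \wedge \cdots \wedge \mathbf{u}_\ell|^2\, |\mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p|\, |\mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q|,$$

and (ii) follows from the first assertion in (i). 

Let p, q be positive integers with p + q = n. Let $K^n$ be $\mathbb{R}^n$ or $\mathbb{C}^n$, equipped with the 
usual Euclidean or Hermitian metric. Let $S^p$, $T^q$ be subspaces of respective dimensions 
p, q which are orthogonal complements of each other. 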
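LEMMA 5G. If $\{\xi_\sigma\}$ is a Grassman coordinate vector for $S^p$, then $\{\eta_\tau\}$ with 
$$\eta_\tau = \varepsilon(\bar\tau, \tau)\, \overline{\xi_{\bar\tau}}$$

is a Grassman coordinate vector for $T^q$. (Here $\bar\tau$ denotes the complement but $\overline{\xi}$ the 
complex conjugate.) 


Example. Let p = 2, q = 1, in the real case. Suppose $S^2$ is spanned by $(x_{11}, x_{12}, x_{13})$ and 
$(x_{21}, x_{22}, x_{23})$. A set of Grassman coordinates will be 

$$(\xi_{12}, \xi_{13}, \xi_{23}) = \left( \begin{vmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{vmatrix}, \begin{vmatrix} x_{11} & x_{13} \\ x_{21} & x_{23} \end{vmatrix}, \begin{vmatrix} x_{12} & x_{13} \\ x_{22} & x_{23} \end{vmatrix} \right).$$

But $T^1$ is spanned by $(\xi_{23}, -\xi_{13}, \xi_{12})$, so that its Grassman 
coordinates are $(\xi_{23}, -\xi_{13}, \xi_{12})$. So the lemma holds. 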


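Proof.‡ Let $\mathbf{x}_1, \dots, \mathbf{x}_p$ be a basis for $S^p$ and $\mathbf{y}_1, \dots, \mathbf{y}_q$ a basis for $T^q$. Write 
$X = \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$, $Y = \mathbf{y}_1 \wedge \cdots \wedge \mathbf{y}_q$. Discarding the $\eta$ notation in the statement of the 
Lemma, write 

$$X = \sum_{\sigma \in C(n,p)} \xi_\sigma E_\sigma, \qquad Y = \sum_{\tau \in C(n,q)} \eta_\tau E_\tau.$$
By Lemma 5F we have $|X \wedge Y| = |X||Y|$. We obtain 

$$\begin{aligned} |X|^2|Y|^2 = |X \wedge Y|^2 &= \Big| \sum_{\tau \in C(n,q)} \xi_{\bar\tau}\eta_\tau\, E_{\bar\tau} \wedge E_\tau \Big|^2 \\ &= \Big| \sum_{\tau \in C(n,q)} (\varepsilon(\bar\tau, \tau)\xi_{\bar\tau})\eta_\tau \Big|^2 \\ &\le \sum_{\tau \in C(n,q)} |\xi_{\bar\tau}|^2 \sum_{\tau \in C(n,q)} |\eta_\tau|^2 \\ &= |X|^2|Y|^2. \end{aligned}$$

Hence, since the Cauchy-Schwarz inequality is an equality, the vector $\{\eta_\tau\}$ must be a 
multiple of the vector $\{\varepsilon(\bar\tau, \tau)\overline{\xi_{\bar\tau}}\}$. The Lemma follows. 

‡This proof was suggested by Prof. L. Baggett. 

Suppose $S^p$ is a rational subspace of $\mathbb{R}^n$ and $\Lambda = \Lambda(S^p)$ is the lattice of integer 
points in $S^p$. Let $\mathbf{x}_1, \dots, \mathbf{x}_p$ be a basis for $\Lambda$. Then 

$$X = \mathbf{x}_1 \wedge \cdots \wedge \mathbf{x}_p$$

will be a set of Grassman coordinates for $S^p$. 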
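LEMMA 5H. The components of X are relatively prime. 


Proof. Suppose the components of X have a common prime factor $\ell$. Then the 
matrix 

$$\begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & & \vdots \\ x_{p1} & \cdots & x_{pn} \end{pmatrix}$$
modulo $\ell$ is singular. So there exist integers $c_1, \dots, c_p$, not all congruent to 0 mod $\ell$, 
such that $c_1\mathbf{x}_1 + \cdots + c_p\mathbf{x}_p \equiv \mathbf{0} \pmod{\ell}$. Without loss of generality, $c_1 \not\equiv 0 \pmod{\ell}$. 
Then take $\mathbf{x}_1' = \frac{1}{\ell}(c_1\mathbf{x}_1 + \cdots + c_p\mathbf{x}_p) \in \Lambda$. But $\mathbf{x}_1'$ is not a linear combination with 
integer coefficients of $\mathbf{x}_1, \dots, \mathbf{x}_p$. Contradiction, since the $\mathbf{x}_i$’s form a basis. 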


COROLLARY 5I. $H(S^p) = |X|$, where $X$ is a Grassman coordinate vector with
coprime components of the rational subspace $S^p$.


COROLLARY 5J. When $S$ is a rational subspace, its orthogonal complement
$S^\perp$ has
$$H(S^\perp) = H(S).$$


Proof. Let $X$ be a Grassman coordinate vector of $S$ with coprime integer components.
By Lemma 5G, $S^\perp$ has a Grassman coordinate vector whose components are the
same as those of $X$, except for signs and ordering.


§6. Absolute Values.


An absolute value is a map $a \mapsto |a|$ from a field $K$ to the reals such that
$$|a| \ge 0, \text{ and } |a| = 0 \text{ precisely when } a = 0,$$
$$|ab| = |a||b|,$$
$$|a + b| \le |a| + |b|.$$
The last relation is called the triangle inequality. When $K = \mathbb{Q}$, the standard absolute
value is an absolute value in our sense. In order to avoid confusion with other possible
absolute values, we will denote it by $|a|_\infty$. For any prime $p$, we can write any nonzero
$a \in \mathbb{Q}$ as $a = p^\nu(u/v)$ where $p \nmid uv$. The $p$-adic absolute value is defined by

$$|a|_p = \begin{cases} p^{-\nu} & \text{if } a \ne 0,\\ 0 & \text{if } a = 0.\end{cases}$$

Then $|a|_p$ satisfies all the axioms of an absolute value. In fact we have $|a+b|_p \le \max(|a|_p, |b|_p)$,
which is stronger than the triangle inequality. An absolute value with

$$|a + b| \le \max(|a|, |b|) \tag{6.1}$$

is called non-Archimedean. Absolute values which don't satisfy (6.1) are called
Archimedean.

Let $M(\mathbb{Q}) = \{\infty, 2, 3, 5, \dots\}$. For $v \in M(\mathbb{Q})$, the absolute value $|\ |_v$ is Archimedean
precisely when $v = \infty$. If $a \in \mathbb{Q}$ is nonzero, we have the product formula

$$\prod_{v\in M(\mathbb{Q})} |a|_v = 1.$$
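The product formula over $\mathbb{Q}$ is easy to test with exact rational arithmetic. The sketch below is our own illustration (the function names `p_adic_abs`, `prime_factors` and `product_over_places` are hypothetical helpers, not notation from these notes). It evaluates $|a|_p$ from the definition and multiplies the Archimedean factor by the finitely many $p$-adic factors different from 1.

```python
from fractions import Fraction

def p_adic_abs(a, p):
    """|a|_p for a rational a and a prime p, returned as an exact Fraction."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    nu, num, den = 0, a.numerator, a.denominator
    while num % p == 0:      # powers of p in the numerator raise nu
        num //= p
        nu += 1
    while den % p == 0:      # powers of p in the denominator lower nu
        den //= p
        nu -= 1
    return Fraction(1, p) ** nu

def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, primes, q = abs(n), set(), 2
    while q * q <= n:
        while n % q == 0:
            primes.add(q)
            n //= q
        q += 1
    if n > 1:
        primes.add(n)
    return primes

def product_over_places(a):
    """Product of |a|_v over v in M(Q); only primes dividing a contribute."""
    a = Fraction(a)
    result = abs(a)          # the Archimedean factor |a|_infinity
    for p in prime_factors(a.numerator) | prime_factors(a.denominator):
        result *= p_adic_abs(a, p)
    return result

print(product_over_places(Fraction(-360, 49)))
```

The same helper can be used to spot-check the ultrametric inequality (6.1), for instance by comparing `p_adic_abs(a + b, p)` with `max(p_adic_abs(a, p), p_adic_abs(b, p))` for a few rationals.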


LEMMA 6A. Let $|\dots|$ be an absolute value on a field $K$ of characteristic zero.
This absolute value is non-Archimedean if and only if $|n| \le 1$ for every $n \in \mathbb{Z}$.


Proof. $|1| = |1||1|$, so that $|1| = 1$. Also $|1| = |-1||-1|$, so that $|-1| = 1$. In the
case of a non-Archimedean absolute value one now proves easily by induction on $n > 0$
that $|n| \le 1$, and therefore also $|-n| = |n| \le 1$.

Conversely, suppose that $|n| \le 1$ for $n \in \mathbb{Z}$. We have

$$(a+b)^\nu = \sum_{i=0}^{\nu} \binom{\nu}{i} a^i b^{\nu-i},$$

therefore

$$|a+b|^\nu \le \sum_{i=0}^{\nu} \Big|\binom{\nu}{i}\Big|\,|a|^i|b|^{\nu-i} \le \sum_{i=0}^{\nu} |a|^i|b|^{\nu-i} \le (\nu+1)N^\nu,$$

where $N = \max(|a|, |b|)$. Taking $\nu$-th roots we get

$$|a+b| \le \sqrt[\nu]{\nu+1}\,N,$$

and letting $\nu \to \infty$ we obtain $|a + b| \le N = \max(|a|, |b|)$.
The absolute value with $|a| = 1$ for every $a \ne 0$ in $K$ is called trivial.


THEOREM 6B. (Ostrowski (1935)) The non-trivial absolute values on $\mathbb{Q}$ are
given by
$$|a| = |a|_\infty^\rho \quad\text{where } 0 < \rho \le 1$$
and
$$|a| = |a|_p^\sigma \quad\text{where } \sigma > 0.$$


Proof. One proves by induction on positive integers $n > 0$ that $|n| \le n$, so that
also $|-n| \le n$, and
$$|n| \le |n|_\infty.$$
Let $a, b \in \mathbb{Z}$ with $a > 1$, $b > 1$. For $\nu > 0$, we can write $b^\nu = c_0 + c_1a + \cdots + c_na^n$
with $0 \le c_i < a$ and $c_n \ne 0$. Then

$$|b|^\nu = |b^\nu| \le |c_0| + |c_1||a| + \cdots + |c_n||a|^n \le (n+1)\,a\,M^n \le \Big(1 + \nu\frac{\log b}{\log a}\Big)\,a\,M^{\nu\log b/\log a},$$

where $M = \max(1, |a|)$. Taking $\nu$-th roots gives

$$|b| \le \Big(1 + \nu\frac{\log b}{\log a}\Big)^{1/\nu} a^{1/\nu} M^{\log b/\log a}.$$

And letting $\nu \to \infty$ gives
$$|b| \le M^{\log b/\log a}.$$
That is,
$$|b| \le \max(1, |a|^{\log b/\log a}). \tag{6.2}$$


Case I. $|\ |$ Archimedean. By Lemma 6A, there exists an integer $b$ with $|b| > 1$.
Then by (6.2) $|a| > 1$ for any $a \in \mathbb{Z}$ with $a > 1$. So $|a| > 1$ for any $a \in \mathbb{Z}$ with
$a \notin \{-1, 0, 1\}$. For $a, b \in \mathbb{Z}$ and $a, b > 1$, we have from the above that

$$|b| \le \max(1, |a|^{\log b/\log a}) = |a|^{\log b/\log a}.$$

Then
$$|b|^{1/\log b} \le |a|^{1/\log a}.$$

By symmetry, we get equality, i.e.
$$|b|^{1/\log b} = |a|^{1/\log a}. \tag{6.3}$$

We have $1 < |b| \le |b|_\infty = b$. So $|b| = b^\rho$ with $0 < \rho \le 1$ and then $|a| = a^\rho = |a|_\infty^\rho$,
by (6.3). Then for any rational $r$, we get $|r| = |r|_\infty^\rho$.


Case II. $|\ |$ non-Archimedean. We have $|n| \le 1$ for every $n \in \mathbb{Z}$. Since $|\ |$ is
non-trivial, there exists $a \in \mathbb{Z}$ with $|a| < 1$. Let $I = \{a \in \mathbb{Z} : |a| < 1\}$. Then $I$ is an
ideal in $\mathbb{Z}$. If $|ab| < 1$, then $|a| < 1$ or $|b| < 1$ since $|ab| = |a||b|$. So $I$ is a prime ideal,
say $I = (p)$.

Now let $r \in \mathbb{Q}$ with $r \ne 0$. Write $r = p^\nu x/y$ with $x, y \in \mathbb{Z}$ and $p \nmid xy$. Then $x, y \notin I$
so $|x| = |y| = 1$ and
$$|r| = |p^\nu| = |p|^\nu.$$
Since $p \in I$, we have $|p| < 1$ so
$$|p| = p^{-\sigma} \quad\text{with } \sigma > 0.$$
Then
$$|r| = |p|^\nu = p^{-\sigma\nu} = (p^{-\nu})^\sigma = |r|_p^\sigma,$$
with $\sigma > 0$.

We now turn to algebraic number fields. With each algebraic number field $K$ there
is associated a set $M(K)$ along with certain absolute values $|\alpha|_v$ where $v \in M(K)$. We
have $|\ |_v \ne |\ |_{v'}$ if $v \ne v'$ and the following:

(i) $M(\mathbb{Q}) = \{\infty, 2, 3, 5, \dots\}$.
(ii) For any $\alpha \in K$, $\alpha \ne 0$, we have $|\alpha|_v = 1$ for all but finitely many $v \in M(K)$.
(iii) With every $v$ is associated a natural number $n_v$ such that for $\alpha \ne 0$ in $K$ we have
the product formula
$$\prod_{v\in M(K)} |\alpha|_v^{n_v} = 1.$$
For $v \in M(\mathbb{Q})$, $n_v = 1$.
(iv) If $K' \subset K$ and $v \in M(K)$, then there is a $v' \in M(K')$ such that $|\alpha|_v$ restricted to
$\alpha \in K'$ equals $|\alpha|_{v'}$. This $v'$ is unique and $n_{v'} \mid n_v$. We write $v \mid v'$.
(v) If $K' \subset K$ and $v' \in M(K')$, then there are finitely many $v \in M(K)$ with $v \mid v'$ and
$$\sum_{v\mid v'} n_v = [K : K']\,n_{v'}.$$
In particular, by (iii), (v), given $v' \in M(\mathbb{Q})$ we have
$$\sum_{v\mid v'} n_v = [K : \mathbb{Q}].$$
(vi) If $K$ is an algebraic extension of $\mathbb{Q}$ with $r_1$ real embeddings mapping $\alpha \in K$ respectively
into $\alpha^{(1)},\dots,\alpha^{(r_1)}$ and $r_2$ pairs of complex conjugate embeddings mapping
$\alpha$ into
$$\alpha^{(r_1+1)}, \overline{\alpha^{(r_1+1)}}, \dots, \alpha^{(r_1+r_2)}, \overline{\alpha^{(r_1+r_2)}},$$
where $r_1 + 2r_2 = [K : \mathbb{Q}]$, then the absolute values dividing $\infty$ are
$$|\alpha^{(1)}|, \dots, |\alpha^{(r_1)}|, |\alpha^{(r_1+1)}|, \dots, |\alpha^{(r_1+r_2)}|.$$
The first $r_1$ of these have $n_v = 1$ and the last $r_2$ have $n_v = 2$.

(vii) Let $\sigma : K \to L$ be an isomorphism. If $v \in M(L)$, for $\alpha \in K$ put $|\alpha|_w = |\sigma\alpha|_v$.
Then $w \in M(K)$, and this gives a one-to-one map $M(L) \to M(K)$, and in this
correspondence $n_w = n_v$.

We listed these properties in an axiomatic way but don’t intend to prove them. 
They can be found in any treatment of algebraic number fields based on absolute values. 
For readers more familiar with classical ideal theory we now sketch the connection with 
ideals. 

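With any algebraic number field $K$ there is associated a ring $\mathfrak{O}$ of integers in $K$.
Any nonzero ideal $\mathfrak{A} \subset \mathfrak{O}$ can uniquely be written as a product of prime ideals, i.e.,
$\mathfrak{A} = \mathfrak{P}_1^{a_1}\cdots\mathfrak{P}_g^{a_g}$ with nonnegative integers $a_1,\dots,a_g$. In particular, any principal ideal
$(p)$ can be factored in this manner, $(p) = \mathfrak{P}_1^{e_1}\cdots\mathfrak{P}_g^{e_g}$, where $\mathfrak{P}_i \mid p$. We can define
norms of these prime ideals by $N(\mathfrak{P}_i) = p^{f_i}$ where $\mathrm{card}(\mathfrak{O}/\mathfrak{P}_i) = p^{f_i}$. A fractional
nonzero ideal $\mathfrak{A}$ can also uniquely be written as $\mathfrak{A} = \mathfrak{P}_1^{a_1}\cdots\mathfrak{P}_g^{a_g}$ with $a_1,\dots,a_g$ in $\mathbb{Z}$.

Non-Archimedean absolute values $|\ |_v$ with $v \in M(K)$ are in one-to-one correspondence
with prime ideals. That is, $|\alpha|_{\mathfrak{P}} = p^{-\nu/e}$ if $(\alpha) = \mathfrak{P}^\nu\mathfrak{A}$ where $\mathfrak{P} \mid p$
and $(p) = \mathfrak{P}^e\mathfrak{P}_2^{e_2}\cdots\mathfrak{P}_g^{e_g}$, and where $\mathfrak{P}$ does not occur in the factorization of $\mathfrak{A}$. So
$|p|_{\mathfrak{P}} = p^{-1} = |p|_p$ and $|\ |_{\mathfrak{P}}$ extends $|\ |_p$. We also have $n_{\mathfrak{P}} = ef$ where $N(\mathfrak{P}) = p^f$.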


Example: Let $K = \mathbb{Q}(\sqrt2)$. Then $7 = (3 + \sqrt2)(3 - \sqrt2)$ in $K$. By (vi), there are
two absolute values dividing $\infty$. Say $v_1 \mid \infty$ and $v_2 \mid \infty$. Then $n_{v_1} = n_{v_2} = 1$. And we
have $|3+\sqrt2|_{v_1} = 3+\sqrt2$, $|3-\sqrt2|_{v_1} = 3-\sqrt2$, $|3+\sqrt2|_{v_2} = 3-\sqrt2$, $|3-\sqrt2|_{v_2} = 3+\sqrt2$.
Now look at the absolute values dividing 7. $\mathfrak{P}_1 = (3 + \sqrt2)$, $\mathfrak{P}_2 = (3 - \sqrt2)$ divide
7. If $w_1, w_2$ are the absolute values associated with $\mathfrak{P}_1, \mathfrak{P}_2$ we have $n_{w_1} = n_{w_2} = 1$
and $|3+\sqrt2|_{w_1} = 7^{-1}$, $|3-\sqrt2|_{w_1} = 1$, $|3+\sqrt2|_{w_2} = 1$, $|3-\sqrt2|_{w_2} = 7^{-1}$. For
$u \ne v_1, v_2, w_1, w_2$, $|3 + \sqrt2|_u = 1$, so the product formula holds for $\alpha = 3 + \sqrt2$ and
$\alpha = 3 - \sqrt2$.


Remark. If $v \mid \infty$, then $|\ |_v$ is Archimedean. If $v \nmid \infty$, then $|\ |_v$ is non-Archimedean.
This follows from Lemma 6A.


§7. Heights in Number Fields. 
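Consider $\boldsymbol\alpha = (\alpha_1,\dots,\alpha_n) \in K^n$. Define

$$|\boldsymbol\alpha|_v = \begin{cases} (|\alpha_1|_v^2 + \cdots + |\alpha_n|_v^2)^{1/2} & \text{if } v \text{ is Archimedean},\\ \max(|\alpha_1|_v,\dots,|\alpha_n|_v) & \text{if } v \text{ is non-Archimedean}.\end{cases}$$

If $\boldsymbol\alpha \ne \mathbf{0}$, then $|\boldsymbol\alpha|_v = 1$ for all but finitely many $v$. For $\boldsymbol\alpha \ne \mathbf{0}$, define the height (or "field
height") of $\boldsymbol\alpha$ by
$$H_K(\boldsymbol\alpha) = \prod_{v\in M(K)} |\boldsymbol\alpha|_v^{n_v}.$$

If, e.g., $\alpha_1 \ne 0$, then $|\boldsymbol\alpha|_v \ge |\alpha_1|_v$ for each $v$, which implies $H_K(\boldsymbol\alpha) \ge 1$. Also $|\lambda\boldsymbol\alpha|_v =
|\lambda|_v|\boldsymbol\alpha|_v$, so $H_K(\lambda\boldsymbol\alpha) = H_K(\boldsymbol\alpha)$ for $\lambda \ne 0$ by the product formula.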


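Example. Take $K = \mathbb{Q}$. Let $\mathbf{x} = (x_1,\dots,x_n)$ be a primitive integer point. We
have $|\mathbf{x}|_\infty = |\mathbf{x}|$ (the usual Euclidean norm) and $|\mathbf{x}|_p = \max(|x_1|_p,\dots,|x_n|_p) = 1$ for
every $p \ne \infty$, since $\mathbf{x}$ is primitive. So $H(\mathbf{x}) = |\mathbf{x}|$.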
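Since $H(\lambda\mathbf{x}) = H(\mathbf{x})$, the height of any nonzero rational vector can be computed by first rescaling it to a primitive integer point. A minimal sketch under that observation (the helper names `primitive_multiple` and `height_squared` are ours); it returns $H(\mathbf{x})^2$ so that everything stays in exact integer arithmetic.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def primitive_multiple(vec):
    """Rescale a nonzero rational vector to an integer vector with coprime entries."""
    vec = [Fraction(v) for v in vec]
    # Least common multiple of the denominators clears all fractions.
    common_den = reduce(lambda a, b: a * b // gcd(a, b),
                        (v.denominator for v in vec), 1)
    ints = [int(v * common_den) for v in vec]
    g = reduce(gcd, ints)          # positive because the vector is nonzero
    return [i // g for i in ints]

def height_squared(vec):
    """H(vec)^2 over Q: squared Euclidean norm of the primitive multiple."""
    return sum(i * i for i in primitive_multiple(vec))

print(height_squared([6, -9, 12]))
print(height_squared([Fraction(1, 2), Fraction(3, 4)]))
```

For a single rational $a/b$ in lowest terms, `height_squared([1, Fraction(a, b)])` returns $a^2 + b^2$.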


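Example. Take $K = \mathbb{Q}(\sqrt2)$. Let $\boldsymbol\alpha = (1, 3+\sqrt2)$. We have $|\boldsymbol\alpha|_{v_1} = \sqrt{1^2 + (3+\sqrt2)^2}$
$= \sqrt6\sqrt{2+\sqrt2}$, $|\boldsymbol\alpha|_{v_2} = \sqrt6\sqrt{2-\sqrt2}$, $|\boldsymbol\alpha|_{w_1} = |\boldsymbol\alpha|_{w_2} = 1$, in fact $|\boldsymbol\alpha|_u = 1$ for $u$ non-Archimedean
in $M(K)$. So $H_K(\boldsymbol\alpha) = |\boldsymbol\alpha|_{v_1}|\boldsymbol\alpha|_{v_2} = 6\sqrt2$.

Suppose $K \subset L$ and $\boldsymbol\alpha \in K^n$, $\boldsymbol\alpha \ne \mathbf{0}$. Then how do $H_K(\boldsymbol\alpha)$ and $H_L(\boldsymbol\alpha)$ compare?

$$H_L(\boldsymbol\alpha) = \prod_{w\in M(L)} |\boldsymbol\alpha|_w^{n_w} = \prod_{v\in M(K)}\ \prod_{w\mid v} |\boldsymbol\alpha|_v^{n_w}.$$

And (v) of section 6 gives
$$H_L(\boldsymbol\alpha) = \prod_{v\in M(K)} |\boldsymbol\alpha|_v^{[L:K]n_v} = (H_K(\boldsymbol\alpha))^{[L:K]}.$$
The absolute height $H(\boldsymbol\alpha)$ is defined by
$$H(\boldsymbol\alpha) = H_K(\boldsymbol\alpha)^{1/[K:\mathbb{Q}]}.$$
Then $H(\boldsymbol\alpha)$ does not depend on the field $K$.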


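Remark. If $K \cong L$ and $\sigma : K \to L$ an isomorphism, $\boldsymbol\alpha \in K^n$, $\sigma\boldsymbol\alpha \in L^n$, then by
(vii) $H_K(\boldsymbol\alpha) = H_L(\sigma\boldsymbol\alpha)$ and $H(\boldsymbol\alpha) = H(\sigma\boldsymbol\alpha)$. So conjugate vectors have the same height.

Exercise 7a. Is it possible to estimate $H(\boldsymbol\alpha + \boldsymbol\beta)$ in terms of $H(\boldsymbol\alpha)$ and $H(\boldsymbol\beta)$? You
would need to suppose $\boldsymbol\alpha \ne \mathbf{0}$, $\boldsymbol\beta \ne \mathbf{0}$, $\boldsymbol\alpha + \boldsymbol\beta \ne \mathbf{0}$.

If $P(X) = \alpha_nX^n + \cdots + \alpha_1X + \alpha_0$ with $\alpha_i \in K$ is a nonzero polynomial, define the
height by
$$H_K(P) = H_K(\alpha_n,\dots,\alpha_1,\alpha_0).$$

We can define $H(P)$, the absolute height of a polynomial, in a similar fashion.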


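LEMMA 7A.
$$H(PQ) \le \sqrt{n+1}\,H(P)H(Q)$$
when $\deg P = n$ (or $n = \min(\deg P, \deg Q)$).

Proof. Write $P = \alpha_nX^n + \cdots + \alpha_0$. Associate $P$ with $\boldsymbol\alpha = (\alpha_n,\dots,\alpha_0)$. Define
$|P|_v = |\boldsymbol\alpha|_v$. Then
$$H_K(P) = \prod_{v\in M(K)} |P|_v^{n_v}.$$

Suppose $v$ is non-Archimedean. Then

$$|PQ|_v = |P|_v|Q|_v.$$

This is essentially Gauss' Lemma. We leave its proof as Exercise 7b. Now write
$Q = \beta_mX^m + \cdots + \beta_0$ and $PQ = \gamma_{n+m}X^{n+m} + \cdots + \gamma_0$, where $\gamma_i = \sum_{a+b=i}\alpha_a\beta_b$. If $v$
is Archimedean, then

$$|PQ|_v^2 = \sum_{i=0}^{n+m} |\gamma_i|_v^2 = \sum_{i=0}^{n+m} \Big|\sum_{a+b=i} \alpha_a\beta_b\Big|_v^2.$$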
Cauchy's inequality implies that

$$\Big|\sum_{k=1}^{N} z_k\Big|^2 \le N\sum_{k=1}^{N} |z_k|^2.$$

Using this inequality we obtain
$$|PQ|_v^2 \le \sum_{i=0}^{n+m} (n+1) \sum_{a+b=i} |\alpha_a\beta_b|_v^2 = (n+1)\sum_a |\alpha_a|_v^2 \sum_b |\beta_b|_v^2 = (n+1)|P|_v^2|Q|_v^2.$$

So
$$|PQ|_v \le \sqrt{n+1}\,|P|_v|Q|_v$$

if $v$ is Archimedean. Then
$$H_K(PQ) \le (n+1)^{[K:\mathbb{Q}]/2} H_K(P)H_K(Q)$$
since $\sum_{v\mid\infty} n_v = [K : \mathbb{Q}]$. And
$$H(PQ) \le (n+1)^{1/2} H(P)H(Q).$$
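Over $K = \mathbb{Q}$ the lemma can be tried out on integer polynomials, where $H(P)$ is the Euclidean norm of the coefficient vector divided by the gcd of the coefficients. The sketch below is our own check and not part of the proof; the names `poly_mul`, `height_sq` and `lemma_7a_holds` are hypothetical, and polynomials are coefficient lists from degree 0 upward with nonzero leading entry. It works with squared heights to stay in exact arithmetic.

```python
from functools import reduce
from math import gcd

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (degree 0 first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def height_sq(p):
    """H(P)^2 over Q for a nonzero integer polynomial: |P|^2 divided by content^2."""
    content = reduce(gcd, p)
    return sum(c * c for c in p) // (content * content)

def lemma_7a_holds(p, q):
    """Check H(PQ)^2 <= (n + 1) H(P)^2 H(Q)^2 with n = min(deg P, deg Q)."""
    n = min(len(p), len(q)) - 1
    return height_sq(poly_mul(p, q)) <= (n + 1) * height_sq(p) * height_sq(q)

print(lemma_7a_holds([1, 1], [1, 1]))        # (1 + X)^2
print(lemma_7a_holds([1, 1, 1], [1, -1]))    # (1 + X + X^2)(1 - X) = 1 - X^3
```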


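For $\alpha \in K^1 = K$, $\alpha \ne 0$, we have $H_K(\alpha) = H_K(1) = 1$. This doesn't tell us much about
$\alpha$, so we define
$$h_K(\alpha) = H_K(1, \alpha)$$
and
$$h(\alpha) = H(1, \alpha).$$
Then
$$h_K(\alpha) = \prod_{v\in M(K)} |(1,\alpha)|_v^{n_v} = \prod_{\substack{v\in M(K)\\ v\ \text{Archimedean}}} \Big(\sqrt{1 + |\alpha|_v^2}\Big)^{n_v} \prod_{\substack{v\in M(K)\\ v\ \text{non-Archimedean}}} \big(\max(1, |\alpha|_v)\big)^{n_v}.$$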


Remark. We have h(1/a) = H(1,1/a) = H(a,1) = h(a). 


Remark. The reader should be warned that other authors often use another height,
with the maximum norm for both Archimedean and non-Archimedean absolute values.
This is true, e.g., of Bombieri and Van der Poorten (1987) or Mueller and Schmidt
(1989). But Bombieri and Vaaler (1983) use the same norm as in these Notes.

Example. Let $a/b \in \mathbb{Q}$ be in lowest terms. Then $h_{\mathbb{Q}}(a/b) = H_{\mathbb{Q}}(1, a/b) =
H_{\mathbb{Q}}(b, a) = \sqrt{a^2 + b^2}$.


Exercise 7c. Estimate $h(\alpha\beta)$ and $h(\alpha + \beta)$ in terms of $h(\alpha)$, $h(\beta)$.


LEMMA 7B. If $P(X) = \alpha_d(X - \gamma_1)\cdots(X - \gamma_d)$ where $\alpha_d, \gamma_1,\dots,\gamma_d$ are algebraic,
then
$$5^{-d/2}\,h(\gamma_1)\cdots h(\gamma_d) \le H(P) \le 2^{(d-1)/2}\,h(\gamma_1)\cdots h(\gamma_d).$$


Remark. Notice that the leading coefficient $\alpha_d$ doesn't enter into the inequalities
since the height of a polynomial is independent of multiplication by a constant factor.
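For polynomials with rational roots both sides of Lemma 7B are explicit: a root $a/b$ in lowest terms has $h(a/b)^2 = a^2 + b^2$, and $P$ may be taken to be $\prod (bX - a)$. The following sketch is our own numerical illustration under these assumptions (the helper names are hypothetical); it compares squared quantities so that only integers occur.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def poly_from_rational_roots(roots):
    """Integer polynomial prod (b*X - a) over roots a/b; coefficients from degree 0 up."""
    poly = [1]
    for root in roots:
        root = Fraction(root)
        a, b = root.numerator, root.denominator
        shifted = [0] + [b * c for c in poly]    # b * X * poly
        scaled = [-a * c for c in poly] + [0]    # -a * poly
        poly = [s + t for s, t in zip(shifted, scaled)]
    return poly

def height_sq(poly):
    """H(P)^2 over Q: squared Euclidean norm divided by the squared content."""
    content = reduce(gcd, poly)
    return sum(c * c for c in poly) // (content * content)

def lemma_7b_bounds(roots):
    """Return (lower bound holds, upper bound holds) for the squared inequalities."""
    d = len(roots)
    h_prod_sq = 1
    for root in roots:
        root = Fraction(root)
        h_prod_sq *= root.numerator ** 2 + root.denominator ** 2   # h(a/b)^2
    H_sq = height_sq(poly_from_rational_roots(roots))
    return h_prod_sq <= 5 ** d * H_sq, H_sq <= 2 ** (d - 1) * h_prod_sq

print(poly_from_rational_roots([2, 3, Fraction(1, 2)]))
print(lemma_7b_bounds([2, 3, Fraction(1, 2)]))
```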


Proof. The upper bound follows on applying Lemma 7A $d - 1$ times. That gives
$$H(P) \le 2^{(d-1)/2}\,H(X - \gamma_1)\cdots H(X - \gamma_d).$$

Notice that $H(X - \gamma) = H(1, -\gamma) = H(1, \gamma) = h(\gamma)$, so the upper bound is proven.
For the lower bound, we may suppose that $\alpha_d = 1$. If $v$ is non-Archimedean, then

$$\prod_{i=1}^{d} \max(1, |\gamma_i|_v) = |P|_v$$

by Gauss' Lemma. If $v$ is Archimedean, then we claim that

$$\prod_{i=1}^{d} \sqrt{1 + |\gamma_i|_v^2} \le 5^{d/2}|P|_v.$$

(This claim will be proven below.) If $K$ is a number field of degree $n$ containing all of
the $\gamma$'s, then
$$\prod_{i=1}^{d} h_K(\gamma_i) \le 5^{dn/2}\,H_K(P).$$

Taking $n$-th roots gives the result

$$\prod_{i=1}^{d} h(\gamma_i) \le 5^{d/2}\,H(P).$$


The proof† of the claim is by induction on $d$. The case $d = 1$ is trivial. For $v$
Archimedean, we may think of $|\ |_v$ as the usual absolute value on $\mathbb{C}$. Without loss of
generality, we have $|\gamma_1| \le \cdots \le |\gamma_d|$, and we may suppose that $|\gamma_d| \ge 2$. (Otherwise
the result is clear.) Write

$$P(X) = X^d + \alpha_{d-1}X^{d-1} + \cdots + \alpha_0 = Q(X)(X - \gamma_d)$$

where $Q(X) = X^{d-1} + \beta_{d-2}X^{d-2} + \cdots + \beta_0$. Then
$$\alpha_i = \beta_{i-1} - \gamma_d\beta_i, \qquad i = 0, 1, \dots, d-1,$$
with
$$\beta_{-1} = 0, \qquad \beta_{d-1} = 1.$$
Writing $c = |\gamma_d|$, we get
$$|\alpha_i|^2 \ge (c|\beta_i| - |\beta_{i-1}|)^2 = c^2|\beta_i|^2 + |\beta_{i-1}|^2 - 2c|\beta_{i-1}\beta_i|$$
$$\ge c^2|\beta_i|^2 + |\beta_{i-1}|^2 - c(|\beta_i|^2 + |\beta_{i-1}|^2) = (c^2 - c)|\beta_i|^2 - (c-1)|\beta_{i-1}|^2,$$

and, summing over $i$,

$$1 + \sum_{i=0}^{d-1} |\alpha_i|^2 \ge 1 + c^2 - c + (c^2 - c)\sum_{i=0}^{d-2} |\beta_i|^2 - (c-1)\sum_{i=0}^{d-2} |\beta_i|^2$$
$$= (c-1)^2\sum_{i=0}^{d-2} |\beta_i|^2 + c^2 - c + 1 = (c-1)^2\Big\{1 + \sum_{i=0}^{d-2} |\beta_i|^2\Big\} + c,$$
so that
$$|P|_v \ge (c-1)|Q|_v$$
and
$$|Q|_v \le \frac{1}{c-1}|P|_v.$$

By induction,
$$\prod_{i=1}^{d-1} (1 + |\gamma_i|^2)^{1/2} \le 5^{(d-1)/2}|Q|_v$$
so that
$$\prod_{i=1}^{d} (1 + |\gamma_i|^2)^{1/2} \le 5^{(d-1)/2}\,\frac{(1 + c^2)^{1/2}}{c-1}\,|P|_v \le 5^{d/2}|P|_v,$$
since $c \ge 2$.

†I am indebted to Prof. Halberstam for simplifying my original proof.
LEMMA 7C. Given $n, d$ and $B$, there are only finitely many non-zero vectors $\boldsymbol\alpha =
(\alpha_1,\dots,\alpha_n)$ with each $\alpha_i$ of degree $\le d$ and $H(\boldsymbol\alpha) \le B$, if you consider proportional
vectors the same.


Proof. Without loss of generality, consider vectors $(1, \alpha_2,\dots,\alpha_n)$. Then

$$H_K(1, \alpha_2,\dots,\alpha_n) \ge H_K(1, \alpha_i) = h_K(\alpha_i) \qquad (i = 2,\dots,n),$$

since $|(1, \alpha_2,\dots,\alpha_n)|_v \ge |(1, \alpha_i)|_v$. Thus it suffices to show there are only finitely many
$\alpha$ of given degree $d$ with $h(\alpha) \le B$. Such $\alpha$ would satisfy a polynomial equation $P(\alpha) = 0$
where

$$P(X) = (X - \alpha^{(1)})(X - \alpha^{(2)})\cdots(X - \alpha^{(d)})$$
with $\alpha^{(1)},\dots,\alpha^{(d)}$ the conjugates of $\alpha$. Here $P$ has rational coefficients. Then
$$H(P) \le 2^{(d-1)/2}\,h(\alpha^{(1)})\cdots h(\alpha^{(d)}) \le 2^{(d-1)/2}B^d$$
since $h(\alpha^{(1)}) = \cdots = h(\alpha^{(d)})$ and $h(\alpha) \le B$. There are only finitely many such polynomials
$P$. Because suppose $P(X) = X^d + p_{d-1}X^{d-1} + \cdots + p_0$ where $p_i \in \mathbb{Q}$ $(i =
0,\dots,d-1)$. If $H(P) = H(1, p_{d-1},\dots,p_0) \le 2^{(d-1)/2}B^d$, then $H(1, p_i) \le 2^{(d-1)/2}B^d$
$(i = 0,\dots,d-1)$, and clearly there are only finitely many such rational $p_0,\dots,p_{d-1}$.


Open Problem. Given n,d, and B, find an asymptotic formula for the number 
of vectors a satisfying the hypothesis of Lemma 7C. 
For $\alpha \in K$ and $v \in M(K)$, we define

$$\|\alpha\|_v = |\alpha|_v^{n_v}.$$
Then the product formula becomes
$$\prod_{v\in M(K)} \|\alpha\|_v = 1$$
for $\alpha \ne 0$ in $K$.


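LEMMA 7D. If $\alpha_1 \ne \alpha_2$ in $K$, then for any $v \in M(K)$,

$$\|\alpha_1 - \alpha_2\|_v = |\alpha_1 - \alpha_2|_v^{n_v} \ge \frac{1}{h_K(\alpha_1)h_K(\alpha_2)}.$$


Proof. If $v$ is non-Archimedean, then

$$|\alpha_1 - \alpha_2|_v \le \max(|\alpha_1|_v, |\alpha_2|_v) \le (\max(1, |\alpha_1|_v))(\max(1, |\alpha_2|_v)).$$

If $v$ is Archimedean, then

$$|\alpha_1 - \alpha_2|_v \le \sqrt{1 + |\alpha_1|_v^2}\,\sqrt{1 + |\alpha_2|_v^2},$$

which follows easily by considering $\alpha_1, \alpha_2$ as complex numbers and $|\ |_v$ as the usual
absolute value on $\mathbb{C}$. Then the product formula gives

$$1 = \prod_{w\in M(K)} |\alpha_1 - \alpha_2|_w^{n_w} = |\alpha_1 - \alpha_2|_v^{n_v} \prod_{\substack{w\in M(K)\\ w\ne v}} |\alpha_1 - \alpha_2|_w^{n_w}$$
$$\le |\alpha_1 - \alpha_2|_v^{n_v} \prod_{w\in M(K)} |(1, \alpha_1)|_w^{n_w}\,|(1, \alpha_2)|_w^{n_w} = |\alpha_1 - \alpha_2|_v^{n_v}\,h_K(\alpha_1)h_K(\alpha_2),$$

and the result follows.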


§8. Heights of Subspaces. 


Let $K$ be an algebraic number field and $S^p$ a subspace of $K^n$ of dimension $p$ spanned by
$\mathbf{x}_1,\dots,\mathbf{x}_p$. Then the Grassman coordinates of $S^p$ are given by $X = \mathbf{x}_1\wedge\cdots\wedge\mathbf{x}_p$. We
define the field height of $S^p$ by
$$H_K(S^p) = H_K(X)$$
and the absolute height by
$$H(S^p) = H(X).$$

When $p = 0$, so that $S^0 = \{\mathbf{0}\}$, put $H_K(S^0) = H(S^0) = 1$. From Lemma 5G, we have
for $0 < p < n$ that
$$H(S^p) = H((S^p)^\perp).$$

Since the space of exterior products of $n$ vectors of $K^n$ is a 1-dimensional vector space, it is clear that $H(K^n) = 1$. Therefore the
above relation holds for $p = 0$ or $p = n$ also.


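LEMMA 8A. If $S, T \subset K^n$ are subspaces and $S + T$ is the subspace of vectors
$\mathbf{s} + \mathbf{t}$ where $\mathbf{s} \in S$, $\mathbf{t} \in T$, then

$$H(S\cap T)\,H(S + T) \le H(S)H(T).$$


Proof. Let $U = S\cap T$. Pick a basis $\mathbf{u}_1,\dots,\mathbf{u}_\ell$ of $U$. Then we can choose a basis
$\mathbf{u}_1,\dots,\mathbf{u}_\ell,\mathbf{x}_1,\dots,\mathbf{x}_p$ for $S$ and a basis $\mathbf{u}_1,\dots,\mathbf{u}_\ell,\mathbf{y}_1,\dots,\mathbf{y}_q$ for $T$. All we need to
show is that

$$|\mathbf{u}_1\wedge\cdots\wedge\mathbf{u}_\ell|_v\,|\mathbf{u}_1\wedge\cdots\wedge\mathbf{u}_\ell\wedge\mathbf{x}_1\wedge\cdots\wedge\mathbf{x}_p\wedge\mathbf{y}_1\wedge\cdots\wedge\mathbf{y}_q|_v$$
$$\le |\mathbf{u}_1\wedge\cdots\wedge\mathbf{u}_\ell\wedge\mathbf{x}_1\wedge\cdots\wedge\mathbf{x}_p|_v\,|\mathbf{u}_1\wedge\cdots\wedge\mathbf{u}_\ell\wedge\mathbf{y}_1\wedge\cdots\wedge\mathbf{y}_q|_v$$

for every $v \in M(K)$. We have already proven the Archimedean case. (See Lemma 5F.)
The non-Archimedean case is left to the reader.

Exercise 8a. Prove the inequality above for $v$ non-Archimedean. (When $\ell = 0$,
the above inequality is to be interpreted as $|\mathbf{x}_1\wedge\cdots\wedge\mathbf{x}_p\wedge\mathbf{y}_1\wedge\cdots\wedge\mathbf{y}_q|_v \le |\mathbf{x}_1\wedge\cdots\wedge\mathbf{x}_p|_v\,|\mathbf{y}_1\wedge\cdots\wedge\mathbf{y}_q|_v$.)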
For $B \in \mathbb{Z}$ with $B > 0$, let $N(K, n, p, B)$ denote the number of $p$-dimensional
subspaces $S^p$ of $K^n$ with $H_K(S^p) \le B$. Then $N(K, n, p, B)$ is finite. W. M. Schmidt
showed (1967) that
$$N(K, n, p, B) \asymp B^n \tag{8.1}$$

where the notation $\asymp$ means there exist constants $C, C'$ such that
$$C'(K, n, p)B^n \le N(K, n, p, B) \le C(K, n, p)B^n.$$

Notice that the constants $C, C'$ may depend on $K, n$, and $p$. Schmidt (1968) also showed
that in the special case $K = \mathbb{Q}$, we have the asymptotic relationship

$$N(\mathbb{Q}, n, p, B) \sim c_0(n, p)B^n$$
for $0 < p < n$. Here

$$c_0(n, p) = \frac{1}{n}\binom{n}{p}\,\frac{V(n)V(n-1)\cdots V(n-p+1)}{V(1)V(2)\cdots V(p)}\cdot\frac{\zeta(2)\cdots\zeta(p)}{\zeta(n)\zeta(n-1)\cdots\zeta(n-p+1)}$$

where $V(n)$ denotes the volume of the unit ball in $\mathbb{R}^n$. Notice that $c_0(n, p) = c_0(n, n-p)$.
In general (i.e., for $K$ an arbitrary number field), an asymptotic formula was
recently derived by J. Thunder (to appear). Schanuel (1979) has given such a formula
for $N(K, n, 1, B)$, but for different heights.


LEMMA 8B. Let $K$ be an algebraic number field of degree $d$ and $\theta \ne 0$ an
algebraic number which is not necessarily in $K$. Given $B \ge 1$, the number of $\alpha$'s in $K$
satisfying
$$h(\alpha\theta) \le B$$
is bounded by
$$36(2B)^{2d}.$$

A similar result is due to Evertse (1984).

Remark. Consider the case $\theta = 1$. For $\alpha \in K$, the condition $h(\alpha) \le B$ can be
written as $H_K(1, \alpha) \le B^d$ since $(h(\alpha))^d = h_K(\alpha) = H_K(1, \alpha)$. Lemma 8B agrees with
(8.1), according to which the number of pairs $(1, \alpha)$ satisfying the last inequality has
order of magnitude $B^{2d}$.


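Proof. For $\alpha \in K$, we will use the notation $\|\alpha\|_v = |\alpha|_v^{n_v}$ and the corresponding
product formula
$$\prod_{v\in M(K)} \|\alpha\|_v = 1.$$
For $\boldsymbol\beta \in K^n$ we will let $\|\boldsymbol\beta\|_v = |\boldsymbol\beta|_v^{n_v}$ so that
$$H_K(\boldsymbol\beta) = \prod_{v\in M(K)} \|\boldsymbol\beta\|_v.$$
Let $L$ be a field containing $K$ and $\theta$. For $v \in M(K)$, we have
$$\|\alpha\|_v = |\alpha|_v^{n_v} = \prod_{\substack{w\in M(L)\\ w\mid v}} \big(|\alpha|_w^{n_w}\big)^{1/[L:K]}$$
since the exponent on the right-hand side is (see (v) of §6)
$$\sum_{\substack{w\in M(L)\\ w\mid v}} \frac{n_w}{[L:K]} = n_v.$$
In our new notation,
$$\|\alpha\|_v = \prod_{\substack{w\in M(L)\\ w\mid v}} \|\alpha\|_w^{1/[L:K]}. \tag{8.2}$$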

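Fix an Archimedean $v \in M(K)$. Assume, say, that $v$ corresponds to a complex
embedding of $K$ into $\mathbb{C}$. Consider

$$h_L(\alpha\theta) = \prod_{w\in M(L)} \|(1, \alpha\theta)\|_w = h_{L1}(\alpha\theta)\,h_{L2}(\alpha\theta)$$
where
$$h_{L1}(\alpha\theta) = \prod_{\substack{w\in M(L)\\ w\mid v}} \|(1, \alpha\theta)\|_w \quad\text{and}\quad h_{L2}(\alpha\theta) = \prod_{\substack{w\in M(L)\\ w\nmid v}} \|(1, \alpha\theta)\|_w.$$
Then
$$h(\alpha\theta) = h_1(\alpha\theta)\,h_2(\alpha\theta)$$
where $h_i(\alpha\theta) = h_{Li}(\alpha\theta)^{1/[L:\mathbb{Q}]}$ $(i = 1, 2)$. We will initially estimate the number of $\alpha$'s
in $K$ satisfying $h_1(\alpha\theta) \le B_1$ and $h_2(\alpha\theta) \le B_2$.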
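Take such an $\alpha$. Then

$$\|\alpha\|_v = \prod_{\substack{w\in M(L)\\ w\mid v}} \|\alpha\|_w^{1/[L:K]} \quad\text{(by (8.2))}$$
$$= M_v^{-d} \prod_{\substack{w\in M(L)\\ w\mid v}} \|\alpha\theta\|_w^{1/[L:K]}$$
where
$$M_v = \prod_{\substack{w\in M(L)\\ w\mid v}} \|\theta\|_w^{1/[L:\mathbb{Q}]}.$$
Then
$$\|\alpha\|_v \le h_{L1}(\alpha\theta)^{1/[L:K]} M_v^{-d} = (h_1(\alpha\theta)/M_v)^d \le (B_1/M_v)^d$$

where the first inequality follows because $\|\alpha\theta\|_w \le \|(1, \alpha\theta)\|_w$ for each $w$. The assumption
that $v$ corresponds to a complex embedding gives

$$|\alpha|_v \le (B_1/M_v)^{d/2}$$
since $\|\alpha\|_v = |\alpha|_v^2$. So $\alpha$ lies in a square centered at the origin with sides of length
$2(B_1/M_v)^{d/2}$.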


Suppose there are $N$ elements $\alpha$ satisfying the inequalities $h_1(\alpha\theta) \le B_1$, $h_2(\alpha\theta)
\le B_2$. Assume that $N \ge 2$. Pick the positive integer $t$ satisfying

$$t^2 < N \le (t+1)^2.$$

Divide the original square into $t^2$ squares of equal size. Since $N > t^2$, there exist $\alpha_1 \ne \alpha_2$
satisfying the two inequalities and lying in the same subsquare. Then

$$|\alpha_1 - \alpha_2|_v \le \frac{2\sqrt2}{t}(B_1/M_v)^{d/2} \le \frac{6}{\sqrt N}(B_1/M_v)^{d/2}$$

and
$$\|\alpha_1 - \alpha_2\|_v = |\alpha_1 - \alpha_2|_v^2 \le \frac{36}{N}(B_1/M_v)^d. \tag{8.3}$$

In the case where $v$ corresponds to a real embedding, the same bound can be derived.
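Since $\theta(\alpha_1 - \alpha_2)$ is non-zero, the product formula holds. Consider first the factor

$$\prod_{\substack{w\in M(L)\\ w\mid v}} \|\theta(\alpha_1 - \alpha_2)\|_w = M_v^{[L:\mathbb{Q}]}\,\|\alpha_1 - \alpha_2\|_v^{[L:K]} \le 36^{[L:K]} B_1^{[L:\mathbb{Q}]} N^{-[L:K]} \quad\text{(by (8.3))}$$
$$= (36\,B_1^d N^{-1})^{[L:K]}.$$

The remaining factor is

$$\prod_{\substack{w\in M(L)\\ w\nmid v}} \|\theta\alpha_1 - \theta\alpha_2\|_w.$$

Using the fact that

$$|\theta\alpha_1 - \theta\alpha_2|_w \le \begin{cases} \max(1, |\theta\alpha_1|_w)\max(1, |\theta\alpha_2|_w) & \text{if } w \text{ is non-Archimedean},\\ \sqrt{1 + |\theta\alpha_1|_w^2}\,\sqrt{1 + |\theta\alpha_2|_w^2} & \text{if } w \text{ is Archimedean},\end{cases}$$

we have
$$\prod_{\substack{w\in M(L)\\ w\nmid v}} \|\theta\alpha_1 - \theta\alpha_2\|_w \le \prod_{\substack{w\in M(L)\\ w\nmid v}} \|(1, \theta\alpha_1)\|_w\,\|(1, \theta\alpha_2)\|_w = h_{L2}(\theta\alpha_1)\,h_{L2}(\theta\alpha_2)$$
$$= (h_2(\theta\alpha_1)\,h_2(\theta\alpha_2))^{[L:\mathbb{Q}]} \le (B_2^{2d})^{[L:K]}.$$

Then the product formula gives
$$1 \le (36\,B_1^d B_2^{2d} N^{-1})^{[L:K]}$$
and
$$N \le 36\,B_1^d B_2^{2d}. \tag{8.4}$$

This is trivially true when $N < 2$. Recall that $N$ was the number of $\alpha \in K$ with
$h_1(\alpha\theta) \le B_1$, $h_2(\alpha\theta) \le B_2$.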

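We now complete the proof of the lemma as follows. Given $\alpha$ with $h(\alpha\theta) =
h_1(\alpha\theta)h_2(\alpha\theta) \le B$, let $k$ be the integer with

$$2^{k-1} \le h_1(\alpha\theta) < 2^k.$$
Then $k \ge 1$ and $2^{k-1} \le B$ so that
$$h_2(\alpha\theta) \le B\cdot 2^{1-k}.$$
Given $k$, the number of corresponding $\alpha \in K$ is (by (8.4) with $B_1 = 2^k$, $B_2 = B\cdot 2^{1-k}$)
$$\le 36\cdot 2^{kd}B^{2d}2^{2d-2kd} = 36\cdot 2^{2d}B^{2d}2^{-kd}.$$
Summing over all $k \ge 1$, the total number of $\alpha$'s is

$$\le 36\cdot 2^{2d}B^{2d}.$$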


§9. Another Version of Siegel’s Lemma. 


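Let $K$ be a number field of degree $d$ with embeddings $\sigma_1,\dots,\sigma_d$ into $\mathbb{C}$. Given
$\boldsymbol\alpha_1,\dots,\boldsymbol\alpha_m$ in $K^n$, let $S$ be the subspace of $K^n$ which they span. We have $\boldsymbol\alpha_1^{(i)},\dots,\boldsymbol\alpha_m^{(i)}$
in $(K^{(i)})^n$ $(i = 1,\dots,d)$, where $\sigma_i$ is the isomorphism $K \to K^{(i)}$. Let $\mathbf{K}$ denote the
compositum of $K^{(1)},\dots,K^{(d)}$, and $\hat S$ the subspace of $\mathbf{K}^n$ spanned by $\boldsymbol\alpha_1^{(1)},\dots,\boldsymbol\alpha_m^{(1)},\dots,
\boldsymbol\alpha_1^{(d)},\dots,\boldsymbol\alpha_m^{(d)}$. If $r = \dim\hat S$, then $r \le md$. Let $\omega_1,\dots,\omega_d$ be a field basis for $K/\mathbb{Q}$,
so that each $\boldsymbol\alpha_j$ has the form

$$\boldsymbol\alpha_j = \sum_{i=1}^{d} \omega_i\mathbf{z}_{ij} \qquad (1 \le j \le m)$$

with $\mathbf{z}_{ij} \in \mathbb{Q}^n$ $(1 \le i \le d)$. Then

$$\boldsymbol\alpha_j^{(\ell)} = \sum_{i=1}^{d} \omega_i^{(\ell)}\mathbf{z}_{ij} \qquad (1 \le \ell \le d).$$

The matrix $(\omega_i^{(\ell)})$ is non-singular. For fixed $j$, each vector $\mathbf{z}_{ij}$ $(1 \le i \le d)$ is a linear
combination of $\boldsymbol\alpha_j^{(1)},\dots,\boldsymbol\alpha_j^{(d)}$. So $\hat S$ is spanned by $\mathbf{z}_{ij}$ $(1 \le j \le m,\ 1 \le i \le d)$, and
$\hat S$ is defined over $\mathbb{Q}$ (i.e., $\hat S$ is spanned by vectors with rational components).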
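The orthogonal complement $\hat S^\perp$ is also defined over $\mathbb{Q}$ and

$$\dim\hat S^\perp = n - r \ge n - md.$$

Assume now that $n > md$. By Lemma 4B, there exists an integer point $\mathbf{z} \ne \mathbf{0}$ in $\hat S^\perp$
with

$$|\mathbf{z}| \ll H(\hat S^\perp)^{1/(n-r)} \le H(\hat S^\perp)^{1/(n-md)} = H(\hat S)^{1/(n-md)}.$$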


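Let $S^{(i)}$ $(1 \le i \le d)$ denote the subspace spanned by $\boldsymbol\alpha_1^{(i)},\dots,\boldsymbol\alpha_m^{(i)}$ and $S$ again the
space spanned by $\boldsymbol\alpha_1,\dots,\boldsymbol\alpha_m$. Then $\hat S = S^{(1)} + \cdots + S^{(d)}$ and
$$H(\hat S) \le H(S^{(1)})\cdots H(S^{(d)}) \quad\text{(by Lemma 8A)}$$
$$= H(S)^d \quad\text{(by (vii) of §6)}$$
$$\le (H(\boldsymbol\alpha_1)\cdots H(\boldsymbol\alpha_m))^d \quad\text{(by Lemma 8A)}.$$
We have proven the following version of Siegel's Lemma: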


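LEMMA 9A. Suppose $K$ is a number field of degree $d$, the vectors $\boldsymbol\alpha_1,\dots,\boldsymbol\alpha_m$ in
$K^n$ span a subspace $S$, and $dm < n$. Then there is a vector $\mathbf{z} \in \mathbb{Z}^n\setminus\mathbf{0}$ with

$$\boldsymbol\alpha_i\mathbf{z} = 0 \qquad (i = 1,\dots,m)$$

and

$$|\mathbf{z}| \ll H(S)^{d/(n-dm)} \le (H(\boldsymbol\alpha_1)\cdots H(\boldsymbol\alpha_m))^{d/(n-dm)}.$$


This particular formulation is due to Bombieri and Vaaler (1983). They assumed,
however, that $S^{(1)} + \cdots + S^{(d)}$ had dimension $md$.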


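Remark. In fact, there are $n - dm$ linearly independent solutions $\mathbf{z}_1,\dots,\mathbf{z}_{n-dm}$
such that

$$|\mathbf{z}_1|\cdots|\mathbf{z}_{n-dm}| \ll H(S)^d \le (H(\boldsymbol\alpha_1)\cdots H(\boldsymbol\alpha_m))^d.$$


More generally, Bombieri and Vaaler consider a number field $k \subset K$ and a system
of equations $\boldsymbol\alpha_i\mathbf{z} = 0$ $(i = 1,\dots,m)$ where $\boldsymbol\alpha_i \in K^n$, and they seek solutions $\mathbf{z} \in k^n$.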


II. Diophantine Approximation 


References: Cassels (1957), Schmidt (1980). 


§1. Dirichlet’s Theorem and Liouville’s Theorem. 


THEOREM 1A. (Dirichlet (1842)) Given $\alpha \in \mathbb{R}$ and $N > 1$, there exist integers
$x, y$ with $1 \le y \le N$ and
$$|\alpha y - x| < 1/N.$$

When $\alpha$ is irrational, there are infinitely many reduced fractions $x/y$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^2}.$$

Remark. It is clear that the first statement remains true when we require that
$x, y$ be relatively prime. Then the second inequality follows by

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{Ny} \le \frac{1}{y^2}.$$

Since $\alpha$ is irrational, $\alpha y - x$ is never zero, thus for fixed $x, y$ the inequality $|\alpha y - x| < 1/N$
can be satisfied only for $N$ under a fixed bound. Then as $N \to \infty$, we must get infinitely
many distinct pairs $x, y$.
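For an integer $N$ the pair $x, y$ promised by Theorem 1A can be found by simply trying $y = 1,\dots,N$. The sketch below is our own illustration, not a proof: the name `dirichlet_pair` is hypothetical, and $\alpha$ is passed as an exact rational (for an irrational $\alpha$ one would feed in a sufficiently accurate rational approximation, which is an extra assumption).

```python
from fractions import Fraction

def dirichlet_pair(alpha, N):
    """Return integers (x, y) with 1 <= y <= N and |alpha*y - x| < 1/N."""
    alpha = Fraction(alpha)
    for y in range(1, N + 1):
        x = round(alpha * y)                      # nearest integer to alpha*y
        if abs(alpha * y - x) < Fraction(1, N):
            return x, y
    # Not reached for an integer N >= 1, by the pigeonhole argument behind Theorem 1A.
    raise AssertionError("no pair found")

# A rational stand-in for sqrt(2), accurate to ten decimal places.
approx_sqrt2 = Fraction(14142135623, 10**10)
print(dirichlet_pair(approx_sqrt2, 10))
```

The search costs $N$ steps; continued fractions find such pairs much faster, but the direct loop mirrors the statement of the theorem most closely.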


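THEOREM 1B. (Dirichlet) Given $\alpha_1,\dots,\alpha_n$ in $\mathbb{R}$ and $N > 1$, there exist
$y, x_1,\dots,x_n$ in $\mathbb{Z}$ with $1 \le y \le N$ and

$$|\alpha_iy - x_i| < N^{-1/n} \qquad (i = 1,\dots,n).$$

If at least one of the $\alpha_1,\dots,\alpha_n$ is irrational, then there are infinitely many $n$-tuples
$\big(\frac{x_1}{y},\dots,\frac{x_n}{y}\big)$ with $\gcd(y, x_1,\dots,x_n) = 1$ and
$$\Big|\alpha_i - \frac{x_i}{y}\Big| < \frac{1}{y^{1+(1/n)}} \qquad (i = 1,\dots,n).$$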

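Consider the system of inequalities:

$$\begin{aligned} |\alpha_{11}x_1 + \cdots + \alpha_{1n}x_n| &< A_1\\ &\ \ \vdots\\ |\alpha_{n-1,1}x_1 + \cdots + \alpha_{n-1,n}x_n| &< A_{n-1}\\ |\alpha_{n1}x_1 + \cdots + \alpha_{nn}x_n| &\le A_n \end{aligned} \tag{1.1}$$

where $|\det(\alpha_{ij})| \ne 0$ and $A_i > 0$ $(i = 1,\dots,n)$. This system defines a parallelepiped of
volume

$$\frac{2^nA_1\cdots A_n}{|\det(\alpha_{ij})|}.$$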
Suppose $A_1\cdots A_n \ge |\det(\alpha_{ij})|$. Then the volume of the parallelepiped is greater than
or equal to $2^n$, and the result would follow by Minkowski's Theorem (2C) of Chapter I
if we had a compact set. However, we have


LEMMA 1C. Suppose $A_i > 0$ $(i = 1,\dots,n)$ and $A_1\cdots A_n \ge |\det(\alpha_{ij})| > 0$ in
(1.1). Then the system of inequalities has a solution $\mathbf{x} \in \mathbb{Z}^n\setminus\mathbf{0}$.


Exercise 1a. Prove Lemma 1C.


Proof (of Theorem 1B). In $\mathbb{R}^{n+1}$, consider the system of inequalities

$$\begin{aligned} |\alpha_1y - x_1| &< N^{-1/n}\\ &\ \ \vdots\\ |\alpha_ny - x_n| &< N^{-1/n}\\ |y| &\le N. \end{aligned}$$

By Lemma 1C, there is a non-trivial solution. If we had $y = 0$, then $x_1,\dots,x_n$ would all
be zero, too. Thus $y \ne 0$. Then there exists a solution with $y > 0$, therefore $1 \le y \le N$.
The second assertion of Theorem 1B follows just like in Theorem 1A.

Theorem 1A was improved by Hurwitz (1891). He showed, for $\alpha$ irrational, that
there exist infinitely many fractions $x/y$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{\sqrt5\,y^2}.$$

See, e.g., Schmidt (1980) for a proof. The following Lemma can be used to show that
the constant $\sqrt5$ is best possible.


LEMMA 1D. Suppose $\alpha$ is a real quadratic irrational satisfying $a\alpha^2 + b\alpha + c = 0$
with $a, b, c \in \mathbb{Z}$, the leading coefficient $a > 0$ and discriminant $D = b^2 - 4ac$. Then for
$A > \sqrt D$, there are only finitely many fractions $x/y$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{Ay^2}.$$

Example. Consider the polynomial equation $\alpha^2 - \alpha - 1 = 0$. Here $D = 5$ and
$\alpha = (1 + \sqrt5)/2$. Using Lemma 1D, we see that for $A > \sqrt5$ there are only finitely many
solutions to $|\alpha - (x/y)| < 1/(Ay^2)$. Thus Hurwitz's result is best possible.
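The behaviour in this example can be observed on the ratios of consecutive Fibonacci numbers, which are the natural approximations to $\alpha = (1+\sqrt5)/2$. The sketch below is our own numerical illustration, assuming 60-digit decimal arithmetic is accurate enough for the first 40 ratios; the name `scaled_errors` is hypothetical. It tabulates $y^2|\alpha - x/y|$, which approaches $1/\sqrt5$ from both sides.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60
SQRT5 = Decimal(5).sqrt()
ALPHA = (1 + SQRT5) / 2          # root of X^2 - X - 1, discriminant D = 5

def scaled_errors(count):
    """y^2 * |alpha - x/y| for the first `count` ratios x/y of consecutive Fibonacci numbers."""
    out, x, y = [], 1, 1
    for _ in range(count):
        out.append(y * y * abs(ALPHA - Decimal(x) / Decimal(y)))
        x, y = x + y, x          # next ratio F_{n+2}/F_{n+1}
    return out

errs = scaled_errors(40)
limit = 1 / SQRT5
for e in errs[:6]:
    print(e.quantize(Decimal("0.000001")), e < limit)
```

Every second ratio satisfies Hurwitz's inequality, while for a constant such as $A = 2.3 > \sqrt5$ only finitely many of these ratios satisfy $|\alpha - x/y| < 1/(Ay^2)$, in line with Lemma 1D.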
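Proof. Writing $f(X) = aX^2 + bX + c = a(X - \alpha)(X - \alpha')$ gives $D = a^2(\alpha - \alpha')^2$.
Then if $|\alpha - (x/y)| < 1/(Ay^2)$, we have

$$\frac{1}{y^2} \le \Big|f\Big(\frac{x}{y}\Big)\Big| = \Big|\alpha - \frac{x}{y}\Big|\,\Big|a\Big(\alpha' - \frac{x}{y}\Big)\Big| < \frac{1}{Ay^2}\Big|a(\alpha' - \alpha) + a\Big(\alpha - \frac{x}{y}\Big)\Big| \le \frac{\sqrt D}{Ay^2} + \frac{a}{A^2y^4}.$$

Subtracting $\sqrt D/(Ay^2)$ from both sides gives

$$\frac{1}{y^2}\Big(1 - \frac{\sqrt D}{A}\Big) < \frac{a}{A^2y^4},$$

which becomes

$$y^2 < \frac{a}{A(A - \sqrt D)}.$$

The result is proven.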
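Given any quadratic irrationality $\alpha$, there exists a $c > 0$ such that

$$\Big|\alpha - \frac{x}{y}\Big| > \frac{c}{y^2}$$
for any $\frac{x}{y} \ne \alpha$. Any irrational $\alpha$ satisfying
$$\Big|\alpha - \frac{x}{y}\Big| > \frac{c(\alpha)}{y^2}$$
for some constant $c(\alpha) > 0$ is called badly approximable.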


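THEOREM 1E. (Liouville (1844).) Suppose $\alpha$ is algebraic of degree $d$. Then
there exists a constant $c(\alpha) > 0$ such that for any rational $\frac{x}{y} \ne \alpha$, we have

$$\Big|\alpha - \frac{x}{y}\Big| > \frac{c(\alpha)}{y^d}.$$


Proof. The proof is broken into three steps which will be important in a more
general context later on.
(a) Let $P(X)$ be the defining polynomial of $\alpha$. So $\deg P = d$, the coefficients of $P$ are
in $\mathbb{Z}$ (i.e., $P(X) \in \mathbb{Z}[X]$), and $P(\alpha) = 0$.

(b) For rational $\frac{x}{y} \ne \alpha$, we have $P\big(\frac{x}{y}\big) \ne 0$, so that
$$\Big|P\Big(\frac{x}{y}\Big)\Big| \ge \frac{1}{y^d}.$$

(c) Expanding $P$ into a Taylor series at $\alpha$, we get

$$P\Big(\frac{x}{y}\Big) = \sum_{i=1}^{d} \Big(\frac{x}{y} - \alpha\Big)^i\,\frac{P^{(i)}(\alpha)}{i!}$$

since $P(\alpha) = 0$. We may assume that

$$\Big|\alpha - \frac{x}{y}\Big| \le 1.$$
(Otherwise, we're done.) Then
$$\frac{1}{y^d} \le \Big|P\Big(\frac{x}{y}\Big)\Big| \le \Big|\alpha - \frac{x}{y}\Big| \sum_{i=1}^{d} \frac{|P^{(i)}(\alpha)|}{i!}.$$
Let $c(\alpha)$ be defined by
$$\sum_{i=1}^{d} \frac{|P^{(i)}(\alpha)|}{i!} = \frac{1}{2c(\alpha)},$$

then the result follows.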


COROLLARY 1F. (Liouville) The number $\alpha = \sum_{\nu=1}^{\infty} 2^{-\nu!}$ is transcendental.
Liouville was first to exhibit transcendental numbers, in fact first to prove the
existence of such numbers.


Proof. Write $y(k) = 2^{k!}$ and $x(k) = 2^{k!}\sum_{\nu=1}^{k} 2^{-\nu!}$. Then $x(k), y(k) \in \mathbb{Z}$ $(k \ge 1)$
and

$$\Big|\alpha - \frac{x(k)}{y(k)}\Big| = \sum_{\nu=k+1}^{\infty} 2^{-\nu!} < 2\cdot 2^{-(k+1)!} < c/y(k)^d$$

for any given $c, d$, provided that $k > k_0(c, d)$. Hence, for any $d$, we have $\alpha$ not algebraic
of degree $d$ by Liouville's Theorem (1E).
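The integers $x(k), y(k)$ and the threshold $k_0(c, d)$ can be made explicit with exact arithmetic. This is a small sketch of our own (the names `liouville_convergent` and `first_good_k` are hypothetical); it uses only the bound $2\cdot 2^{-(k+1)!}$ from the proof, not the infinite tail itself.

```python
from fractions import Fraction
from math import factorial

def liouville_convergent(k):
    """Return (x(k), y(k)) with y(k) = 2**(k!) and x(k)/y(k) the k-th partial sum."""
    y = 2 ** factorial(k)
    x = sum(2 ** (factorial(k) - factorial(v)) for v in range(1, k + 1))
    return x, y

def first_good_k(c, d):
    """Smallest k >= 1 with 2 * 2**(-(k+1)!) < c / y(k)**d, checked exactly."""
    c = Fraction(c)
    k = 1
    while Fraction(2, 2 ** factorial(k + 1)) >= c / Fraction(2 ** (d * factorial(k))):
        k += 1
    return k

print(liouville_convergent(3))
print(first_good_k(Fraction(1, 1000), 3))
```

Because $(k+1)! - d\,k! = k!\,(k+1-d)$ grows without bound, the loop terminates for every choice of $c > 0$ and $d$.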

The numbers which can be proved transcendental by Liouville’s Theorem are called 
“Liouville numbers”. They form a set of measure zero. This explains why Liouville’s 
Theorem is not enough to prove the transcendence of classical numbers such as $e$ or $\pi$.


Exercise 1b. Given $\alpha \in \mathbb{R}$ and $N > 0$, there exist $x, y \in \mathbb{Z}$, not both 0, with

$$N|\alpha y - x| + N^{-1}|y| \le \sqrt2.$$

Now use the arithmetic-geometric inequality to show that given $\alpha \in \mathbb{R}\setminus\mathbb{Q}$, there are
infinitely many rationals $\frac{x}{y}$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{2y^2}.$$

(This is better than Dirichlet's Theorem, but worse than Hurwitz's Theorem.)


§2. Roth’s Theorem. 


A consequence of Liouville's Theorem is the following: If $\deg\alpha = d \ge 2$ and $\mu > d$,
then

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^\mu}$$

has only finitely many solutions $\frac{x}{y}$. Thue (1908) strengthened this result by weakening
the hypothesis to $\mu > (d/2) + 1$. Siegel (1921) in his thesis improved this to $\mu > 2\sqrt d$.
Dyson (1947) and Gelfond (1952) showed that the result holds for $\mu > \sqrt{2d}$. In 1958,
Roth received a Fields Medal for his 1955 result with $\mu > 2$. Dirichlet's Theorem shows
that Roth's result is best possible.


THEOREM 2A. (Roth (1955).) If $\alpha$ is algebraic and $\delta > 0$, there are only finitely
many rationals $\frac{x}{y}$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^{2+\delta}}.$$

Remarks.
(i) Roth's result is correct but trivial for $\alpha \in \mathbb{C}\setminus\mathbb{R}$.
(ii) If $\deg\alpha = 2$, then Lemma 1D is better.
(iii) We know that there are infinitely many $\frac{x}{y}$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^2},$$
and only finitely many $\frac{x}{y}$ with
$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^{2+\delta}}$$

with $\delta > 0$. For any given $\alpha$ with $\deg\alpha \ge 3$, it is still unknown whether $\alpha$ is badly
approximable, i.e. whether there exists a $c > 0$ so that

$$\Big|\alpha - \frac{x}{y}\Big| > \frac{c}{y^2}$$

for every rational $\frac{x}{y}$. The conjecture is that this holds for no algebraic $\alpha$ of degree
$\ge 3$.
(iv) Another conjecture is that Roth's Theorem holds in the following strengthened
form: the inequality
$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^2(\log y)^k}$$
has only finitely many solutions for $k > 1$.
The following theorem gives heuristic grounds for the conjectures in (iii) and (iv).


THEOREM 2B. (Khintchine (1926).) Suppose $\psi(y) > 0$ is defined on the positive
integers and $\psi$ is nonincreasing. Consider the inequality

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{\psi(y)}{y}. \tag{2.1}$$

If
(i) $\sum_{y=1}^{\infty} \psi(y) < \infty$, then (2.1) has only finitely many solutions for almost all $\alpha$.
(ii) $\sum_{y=1}^{\infty} \psi(y) = \infty$, then (2.1) has infinitely many solutions for almost all $\alpha$.


Remarks. Take $\psi(y) = 1/(y(\log y)^k)$ with $k > 1$. Then case (i) in Khintchine's
Theorem says that
$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^2(\log y)^k}$$
has only finitely many solutions for almost all $\alpha$. Taking $\psi(y) = 1/(y\log y)$, case (ii) tells
us that

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^2\log y}$$

has infinitely many solutions for almost all $\alpha$.

Here we will only prove the easy part (i) of Khintchine's Theorem. The inequality
(2.1) defines an interval for $\alpha$ of length $2\psi(y)/y$. The union of these intervals for
$x = 1, 2, \dots, y$ has measure $\le 2\psi(y)$. The union of the intervals (2.1) with $x \in \mathbb{Z}$ is a
set which is invariant under translations by integers, and the intersection of this set with
$0 \le \alpha < 1$ has measure $\le 2\psi(y)$. Thus if $S(y)$ is the set of numbers $\alpha$ in $0 \le \alpha < 1$
for which (2.1) holds for some $x$, then $S(y)$ has measure $\mu(S(y)) \le 2\psi(y)$. Further if

$$S_N = \bigcup_{y=N}^{\infty} S(y),$$

then $\mu(S_N) \to 0$ since $\sum_{y=1}^{\infty} \psi(y)$ is convergent. Now $\alpha$ in $0 \le \alpha < 1$ has infinitely
many solutions to (2.1) precisely when $\alpha$ lies in

$$\bigcap_{N=1}^{\infty} S_N;$$

but this set has measure 0.


Outline of Proof of Roth’s Theorem. We might try the following. 
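(a) Pick a polynomial $P(X) \in \mathbb{Z}[X]$ which is not identically zero, vanishes at $\alpha$ of order
$q$, and has degree $r$.

(b) Show that $P\big(\frac{x}{y}\big) \ne 0$ with only finitely many exceptions $\frac{x}{y}$. Then

$$\Big|P\Big(\frac{x}{y}\Big)\Big| \ge \frac{1}{y^r}.$$

(c) By Taylor's formula,

$$P\Big(\frac{x}{y}\Big) = \sum_{i=q}^{r} \Big(\frac{x}{y} - \alpha\Big)^i\,\frac{P^{(i)}(\alpha)}{i!}.$$

Then if $\big|\frac{x}{y} - \alpha\big| < 1$, we have
$$\Big|P\Big(\frac{x}{y}\Big)\Big| \le C(\alpha)\Big|\frac{x}{y} - \alpha\Big|^q$$
for some constant $C(\alpha)$, so that
$$\Big|\alpha - \frac{x}{y}\Big| \ge \frac{C'}{y^{r/q}}$$

with $C'$ constant.

This is good if $r/q$ is small. But if $\deg\alpha = d$, then $r \ge qd$. Thus $r/q \ge d$, and we
get no improvement over Liouville's Theorem.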

This argument can be modified by using a polynomial in m variables. Thue used 
a polynomial in 2 variables of the form P(X1, X2) = X2Q(X1) — P(X1). Siegel used a 
more general polynomial in two variables, and so did Dyson and Gelfond. Only Roth 
was able to overcome the difficulties involved in dealing with more than 2 variables. 

To see why a polynomial in $m$ variables offers an advantage, consider $P(X_1,\dots,X_m)
\in \mathbb{Z}[X_1,\dots,X_m]$ of degree at most $r$ in each variable.† Such a polynomial is made up
of monomials $X_1^{i_1}\cdots X_m^{i_m}$ with $0 \le i_1,\dots,i_m \le r$. The number of such monomials is
$(r+1)^m$, so the number of possible coefficients is also $(r+1)^m \sim r^m$ as $r \to \infty$.

Try to make $P$ vanish at $(\alpha,\dots,\alpha)$ of order $q$. Then

$$P^{(j_1,\dots,j_m)}(\alpha,\dots,\alpha) = 0$$

for $j_1 + \cdots + j_m < q$. The number of implied linear homogeneous equations with rational
coefficients for the coefficients of $P$ does not exceed

$$d\binom{q+m}{m} \sim \frac{d\,q^m}{m!} \tag{2.2}$$

as $q \to \infty$. Roughly speaking, we can choose $r, q$ with
$$r^m \sim dq^m/m!, \qquad r/q \sim \sqrt[m]{d/m!}.$$

†One might be tempted to try a polynomial of bounded total degree, but difficulties in part (b)
preclude such an approach.


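Use the multidimensional Taylor’s formula to write

$$P\Big(\frac{x}{y},\ldots,\frac{x}{y}\Big) = \sum \frac{1}{j_1! \cdots j_m!} \frac{\partial^{j_1+\cdots+j_m} P}{\partial X_1^{j_1} \cdots \partial X_m^{j_m}}(\alpha,\ldots,\alpha) \Big(\frac{x}{y} - \alpha\Big)^{j_1+\cdots+j_m},$$

where the sum is over $j_1,\ldots,j_m$ with $j_1+\cdots+j_m \ge q$. Then if things go well, in particular when $P(x/y,\ldots,x/y) \ne 0$, we have

$$\frac{1}{y^{rm}} \le \Big|P\Big(\frac{x}{y},\ldots,\frac{x}{y}\Big)\Big| \ll \Big|\alpha - \frac{x}{y}\Big|^{q}$$

and

$$\Big|\alpha - \frac{x}{y}\Big| \gg \frac{1}{y^{rm/q}}$$

with $rm/q \approx c_m \sqrt[m]{d}$, where

$$c_m = m/\sqrt[m]{m!}. \qquad (2.3)$$

So we can hope to improve Liouville’s result to

$$\mu > c_m d^{1/m},$$

and this will actually be achieved in Theorem 6A below. In order to get Roth’s Theorem, one further has to show that the number of conditions imposed on the auxiliary polynomial $P$ is often less than (2.2).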
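The parameter count behind this heuristic can be tried numerically. The following is only a sketch under the assumptions of the outline above: the helper names (`c_m`, `hoped_exponent`, `num_conditions`, `least_degree`) are illustrative and not from the text, and `least_degree` simply searches for the smallest $r$ with more unknowns $(r+1)^m$ than the bound (2.2) on the number of conditions.

```python
from math import comb, factorial


def c_m(m: int) -> float:
    """c_m = m / (m!)^(1/m), as in (2.3)."""
    return m / factorial(m) ** (1.0 / m)


def hoped_exponent(d: int, m: int) -> float:
    """The exponent c_m * d^(1/m) suggested by the heuristic."""
    return c_m(m) * d ** (1.0 / m)


def num_conditions(d: int, q: int, m: int) -> int:
    """The bound (2.2): d * C(q+m-1, m) rational linear conditions."""
    return d * comb(q + m - 1, m)


def least_degree(d: int, q: int, m: int) -> int:
    """Least r with (r+1)^m > num_conditions(d, q, m) (hypothetical helper)."""
    r = 0
    while (r + 1) ** m <= num_conditions(d, q, m):
        r += 1
    return r


if __name__ == "__main__":
    d, q = 10, 200
    for m in (1, 2, 3):  # these m satisfy m! <= d
        r = least_degree(d, q, m)
        # r*m/q should be close to c_m * d^(1/m) when q is large
        print(m, r * m / q, hoped_exponent(d, m))
```

For $m = 1$ the exponent is $d$ (Liouville), while for larger $m$ the ratio $rm/q$ found by the search approaches $c_m d^{1/m}$ as $q$ grows.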

The difficulty with this approach is in step (b). The zero-set of $P(X_1,\ldots,X_m)$ is some algebraic manifold in $\mathbb{R}^m$, so that it is hard to show that $P(x/y,\ldots,x/y) \ne 0$. To overcome this difficulty, one considers instead an $m$-tuple $x_1/y_1,\ldots,x_m/y_m$ of distinct rational approximations, and tries to show that $P(x_1/y_1,\ldots,x_m/y_m) \ne 0$. It turns out that one needs $y_1 < y_2 < \cdots < y_m$ increasing rapidly.

In order to make this approach work, one needs $|\alpha - x_i/y_i|$ all small $(i = 1,\ldots,m)$. For example, in the case $m = 2$, one needs two good approximations $x_1/y_1$, $x_2/y_2$ with $y_2$ much larger than $y_1$. This is why just one very good approximation gives no contradiction, and the result is “ineffective” in the sense that no bound can be stated for the size of the denominators $y$ of very good approximations.

Effective improvements of Liouville’s Theorem for certain cubic irrationals were given by A. Baker (1964). Then Feldman (1971) used Baker’s theory of linear forms in logarithms to give improvements for general algebraic $\alpha$. These were of the type

$$\Big|\alpha - \frac{x}{y}\Big| > \frac{c(\alpha)}{y^{d - c_1(\alpha)}},$$

where $c(\alpha) > 0$ and $c_1(\alpha) > 0$ are effective. Unfortunately, $c_1(\alpha)$ so obtained is usually very small. Then further improvements for special numbers were obtained by Baker and Stewart (1988), Bombieri (1982), Bombieri and Mueller (1983), Chudnovsky (1983).
§3. Construction of a Polynomial.


We will follow Bombieri and Van der Poorten (1987). We will construct a polynomial $P(X_1,\ldots,X_m) \in \mathbb{Z}[X_1,\ldots,X_m]$. For any such polynomial $P$, define $|P|$ to be the maximum absolute value of its coefficients. If $I = (i_1,\ldots,i_m)$, then let

$$P^{I} = \frac{1}{i_1! \cdots i_m!} \frac{\partial^{i_1+\cdots+i_m}}{\partial X_1^{i_1} \cdots \partial X_m^{i_m}} P.$$

Then $P^{I} \in \mathbb{Z}[X_1,\ldots,X_m]$ for $P \in \mathbb{Z}[X_1,\ldots,X_m]$. If $P$ has degree $\le r_i$ in the variable $X_i$ $(i = 1,\ldots,m)$, then it is easily seen that

$$|P^{I}| \le 2^{r_1+\cdots+r_m} |P| = 2^{r} |P| \qquad (3.1)$$

with $r = r_1+\cdots+r_m$. One similarly shows that

$$|(P^{I})^{J}| \le 3^{r_1+\cdots+r_m} |P| = 3^{r} |P|. \qquad (3.2)$$

We will say that $P$ is of multidegree $\le R = (r_1,\ldots,r_m)$ if its degree in $X_i$ is $\le r_i$ $(i = 1,\ldots,m)$.

Let $\boldsymbol{\alpha} = (\alpha_1,\ldots,\alpha_m)$ with real or complex coordinates and $E = (e_1,\ldots,e_m)$ with natural coordinates be given. If $P(X_1,\ldots,X_m) \not\equiv 0$, the index of $P$ at $\boldsymbol{\alpha}$ with respect to $E$ is defined to be the largest value of $t$ such that

$$P^{I}(\boldsymbol{\alpha}) = 0$$

for every $I = (i_1,\ldots,i_m)$ with

$$\frac{i_1}{e_1} + \cdots + \frac{i_m}{e_m} < t.$$

The index of the zero polynomial is understood to be $\infty$. (This definition is due to Roth.)
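The definition can be made concrete with a small computation. The sketch below is an illustration only, not part of the original notes: it assumes a polynomial is stored as a dictionary mapping exponent tuples $J$ to rational coefficients $c(J)$, evaluates $P^{I}(\boldsymbol{\alpha}) = \sum_J c(J) \binom{j_1}{i_1} \cdots \binom{j_m}{i_m} \alpha_1^{j_1-i_1} \cdots \alpha_m^{j_m-i_m}$ exactly at a rational point, and returns the least weight $i_1/e_1 + \cdots + i_m/e_m$ with $P^{I}(\boldsymbol{\alpha}) \ne 0$, which is the index. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb


def hasse_derivative(P, I, alpha):
    """Evaluate P^I at the rational point alpha (P given as {J: coefficient})."""
    total = Fraction(0)
    for J, c in P.items():
        if all(j >= i for j, i in zip(J, I)):
            term = Fraction(c)
            for j, i, a in zip(J, I, alpha):
                term *= comb(j, i) * Fraction(a) ** (j - i)
            total += term
    return total


def index(P, alpha, E):
    """Index of P at alpha with respect to E; None stands for the zero polynomial."""
    P = {J: c for J, c in P.items() if c != 0}
    if not P:
        return None  # index of the zero polynomial is infinite
    m = len(alpha)
    degs = [max(J[k] for J in P) for k in range(m)]
    best = None
    for I in product(*(range(dk + 1) for dk in degs)):
        if hasse_derivative(P, I, alpha) != 0:
            w = sum(Fraction(i, e) for i, e in zip(I, E))
            if best is None or w < best:
                best = w
    return best


# Example: P = (X1 - 1)^2 (X2 - 2), expanded by hand.
P = {(2, 1): 1, (2, 0): -2, (1, 1): -2, (1, 0): 4, (0, 1): 1, (0, 0): -2}
print(index(P, (1, 2), (1, 1)))  # order of vanishing at (1, 2)
print(index(P, (1, 2), (2, 1)))  # index with respect to the multidegree (2, 1)
```

With $E = (1,1)$ the result is the ordinary order of vanishing; with $E$ equal to the multidegree, each variable is weighted by its degree, which is the normalization used in the applications.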


Remarks. If $e_1 = \cdots = e_m = 1$, then the index is simply the order of vanishing of $P$ at $\boldsymbol{\alpha}$. The $e_i$’s allow different weights to be given to the different variables.
Given $m$-tuples $I = (i_1,\ldots,i_m)$ and $E = (e_1,\ldots,e_m)$, let

$$\frac{I}{E} = \Big(\frac{i_1}{e_1},\ldots,\frac{i_m}{e_m}\Big).$$

Let $G(t, R/E)$ denote the set of $(\xi_1,\ldots,\xi_m)$ satisfying

$$0 \le \xi_i \le 1 \quad (i = 1,\ldots,m)$$

and

$$\frac{r_1}{e_1}\xi_1 + \cdots + \frac{r_m}{e_m}\xi_m \le t.$$

Then

$$\frac{i_1}{e_1} + \cdots + \frac{i_m}{e_m} \le t$$

if and only if

$$\frac{I}{R} \in G\Big(t, \frac{R}{E}\Big). \qquad (3.3)$$

The index of $P$ with respect to $E$ is the largest value of $t$ such that $P^{I}(\boldsymbol{\alpha}) = 0$ for every $I$ satisfying (3.3).


Remark. In our applications, $R$ and $E$ will both be the multidegree of $P$.
Let $W(t, R/E)$ denote the volume of $G(t, R/E)$, let $G(t) = G(t, R/R)$, and $W(t) = W(t, R/R)$ the volume of $G(t)$.

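LEMMA 3A. Suppose $\alpha_1,\ldots,\alpha_m$ lie in an algebraic number field $K$ of degree $d$. Suppose $t > 0$ and $dW(t) < 1$. Suppose $\varepsilon > 0$. Then if $R = (r_1,\ldots,r_m)$ is large (i.e., each $r_i \ge c(\alpha_1,\ldots,\alpha_m,t,\varepsilon)$), there exists a polynomial

$$P(X_1,\ldots,X_m) \in \mathbb{Z}[X_1,\ldots,X_m]$$

which is not identically zero and has multidegree $\le R$, such that the index of $P$ at $\boldsymbol{\alpha} = (\alpha_1,\ldots,\alpha_m)$ with respect to $R$ is $\ge t$, and

$$|P| \le \big((4h(\alpha_1))^{r_1} \cdots (4h(\alpha_m))^{r_m}\big)^{\frac{dW(t)}{1-dW(t)}+\varepsilon}.$$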


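Proof. Try

$$P(X_1,\ldots,X_m) = \sum_{j_1=0}^{r_1} \cdots \sum_{j_m=0}^{r_m} c(j_1,\ldots,j_m) X_1^{j_1} \cdots X_m^{j_m} = \sum_{J} c(J) X^{J},$$

with $J = (j_1,\ldots,j_m)$. The coefficients $c(J)$ are the “unknowns” which need to be found. The number of coefficients (i.e., unknowns) is $n = (r_1+1) \cdots (r_m+1)$. We want

$$P^{I}(\alpha_1,\ldots,\alpha_m) = 0$$

for every $I$ with $I/R \in G(t)$. That is,

$$\sum_{J} c(J) \binom{j_1}{i_1} \cdots \binom{j_m}{i_m} \alpha_1^{j_1-i_1} \cdots \alpha_m^{j_m-i_m} = 0.$$

This is a system of homogeneous linear equations in $n$ unknowns. The conditions are parametrized by $I$ with $I/R \in G(t)$.
Given such an $I$, denote the coefficient vector of the linear condition by $\mathbf{a}_I$. The components of $\mathbf{a}_I$ are

$$\binom{j_1}{i_1} \cdots \binom{j_m}{i_m} \alpha_1^{j_1-i_1} \cdots \alpha_m^{j_m-i_m}.$$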
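For $v \in M(K)$ non-Archimedean, we have

$$|\mathbf{a}_I|_v \le \max(1,|\alpha_1|_v)^{r_1} \cdots \max(1,|\alpha_m|_v)^{r_m}.$$

For $v \in M(K)$ Archimedean, each component of $\mathbf{a}_I$ has norm

$$\le 2^{r_1+\cdots+r_m} (1+|\alpha_1|_v^2)^{r_1/2} \cdots (1+|\alpha_m|_v^2)^{r_m/2},$$

and the number of components of $\mathbf{a}_I$ is $n \le 2^{r_1+\cdots+r_m}$. Thus for $v$ Archimedean,

$$|\mathbf{a}_I|_v \le 4^{r_1+\cdots+r_m} (1+|\alpha_1|_v^2)^{r_1/2} \cdots (1+|\alpha_m|_v^2)^{r_m/2}.$$

Then

$$H_K(\mathbf{a}_I) \le 4^{d(r_1+\cdots+r_m)} h_K(\alpha_1)^{r_1} \cdots h_K(\alpha_m)^{r_m}$$

and

$$H(\mathbf{a}_I) \le (4h(\alpha_1))^{r_1} \cdots (4h(\alpha_m))^{r_m}.$$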


Recall, the number of unknowns $c(J)$ is

$$n = (r_1+1) \cdots (r_m+1) \sim r_1 r_2 \cdots r_m.$$

Let $k$ be the number of conditions (i.e., the number of $I$’s with $I/R \in G(t)$). For large $R$, we have

$$k \sim r_1 r_2 \cdots r_m W(t).$$

This can be seen in the case $m = 2$ by considering the following picture.

[Figure not reproduced.]

So we have

$$n - dk \sim r_1 r_2 \cdots r_m (1 - dW(t)).$$

Therefore,

$$\frac{dk}{n-dk} \le \frac{dW(t)}{1-dW(t)} + \varepsilon$$

for $R$ sufficiently large. By Siegel’s Lemma 9A of Chapter I, there is a solution of bounded size. More precisely, there exists a polynomial $P \not\equiv 0$ satisfying the index condition, of size

$$|P| \le \big((4h(\alpha_1))^{r_1} \cdots (4h(\alpha_m))^{r_m}\big)^{\frac{dW(t)}{1-dW(t)}+\varepsilon}.$$
§4. Upper Bounds for the Index.


THEOREM 4A. (Roth’s Lemma (1955).) Suppose $0 < \varepsilon < 1/12$. Put

$$\omega = \omega(m,\varepsilon) = 24 \cdot 2^{-m} (\varepsilon/12)^{2^{m-1}}$$

where $m \in \mathbb{N}$. Let $R = (r_1,\ldots,r_m)$ with

$$\omega r_h \ge r_{h+1} \quad (h = 1,\ldots,m-1).$$

Suppose $x_1/y_1,\ldots,x_m/y_m$ are rationals in reduced form with denominators satisfying

$$y_h^{\omega} \ge 2^{3m} \quad (h = 1,\ldots,m)$$

and

$$y_h^{r_h} \ge y_1^{r_1} \quad (h = 1,\ldots,m).$$

Let $P(X_1,\ldots,X_m) \in \mathbb{Z}[X_1,\ldots,X_m]$ be such that $P \not\equiv 0$ and

$$|P| \le y_1^{\omega r_1}.$$

Suppose $P$ is of multidegree $\le R$. Then the index of $P$ at $(x_1/y_1,\ldots,x_m/y_m)$ with respect to $R$ is at most $\varepsilon$.

See Roth (1955), Cassels (1957), or Schmidt (1980) for a proof. Roth’s Theorem 
may be proved either by using Roth’s Lemma or Theorem 4B below. Neither of these 
will be proved in these Notes. 

The proof is by induction on $m$. Here we will only consider the (trivial) case $m = 1$. Let $P(X) \in \mathbb{Z}[X]$ and $x_1/y_1$ a rational with $\gcd(x_1,y_1) = 1$. We may write

$$P(X) = \Big(X - \frac{x_1}{y_1}\Big)^{\ell} M(X)$$

where $M(x_1/y_1) \ne 0$ and $\ell$ is the order of vanishing of $P$ at $x_1/y_1$. We can also write

$$P(X) = (y_1 X - x_1)^{\ell} Q(X)$$

where $Q(x_1/y_1) \ne 0$. Since $P(X) \in \mathbb{Z}[X]$ and $(y_1 X - x_1)$ has integer coefficients and content 1, we get $Q(X) \in \mathbb{Z}[X]$ by Gauss’ Lemma. Thus the leading coefficient of $P$ is divisible by $y_1^{\ell}$ and

$$y_1^{\ell} \le |P| \le y_1^{\omega r_1} = y_1^{\varepsilon r_1}.$$

Therefore,

$$\frac{\ell}{r_1} \le \varepsilon$$

and the index of $P$ at $x_1/y_1$ with respect to $R = (r_1)$ is $\le \varepsilon$.

Note that this argument, using Gauss’ Lemma, is arithmetical.

Now suppose $P(X_1,\ldots,X_m) \in \mathbb{Z}[X_1,\ldots,X_m]$ is a polynomial of multidegree $\le R = (r_1,\ldots,r_m)$. The number of coefficients of $P$ is $(r_1+1) \cdots (r_m+1) \sim r_1 r_2 \cdots r_m$ if each $r_i$ is large. Suppose the index of $P$ at $(\alpha_1,\ldots,\alpha_m) \in \mathbb{C}^m$ with respect to $E$ is $\ge t$. Thus $P^{I}(\alpha_1,\ldots,\alpha_m) = 0$ for every $I$ with $I/R \in G(t, R/E)$. The number of such conditions is given asymptotically by $r_1 r_2 \cdots r_m W(t, R/E)$. Suppose $\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_k \in \mathbb{C}^m$ and the index of $P$ at $\boldsymbol{\alpha}_h$ with respect to $E$ is $\ge t_h$ $(h = 1,\ldots,k)$. Then the total number of conditions is approximately

$$r_1 \cdots r_m \sum_{h=1}^{k} W\Big(t_h, \frac{R}{E}\Big).$$

If these conditions are independent and $P \not\equiv 0$, then $\sum_{h=1}^{k} W(t_h, R/E)$ should not be much larger than 1.


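THEOREM 4B. (Esnault and Viehweg (1984).) Suppose $r_1 \ge r_2 \ge \cdots \ge r_m$, and let $\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_k \in \mathbb{C}^m$ with the condition that if $\boldsymbol{\alpha}_i = (\alpha_{i1},\ldots,\alpha_{im})$, then

$$\alpha_{i\ell} \ne \alpha_{j\ell} \quad \text{if } i \ne j \quad (1 \le \ell \le m).$$

Suppose $P(X_1,\ldots,X_m) \in \mathbb{C}[X_1,\ldots,X_m]$ of multidegree $\le R$ with $P \not\equiv 0$, and the index of $P$ at $\boldsymbol{\alpha}_h$ with respect to $E = (e_1,\ldots,e_m)$ is $\ge t_h$ $(h = 1,\ldots,k)$. Then

$$\sum_{h=1}^{k} W\Big(t_h, \frac{R}{E}\Big) \le \prod_{j=1}^{m-1} \Big(1 + (k'-2) \sum_{i=j+1}^{m} \frac{r_i}{r_j}\Big)$$

where $k' = \max(2,k)$.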

Bombieri (1982) did the case $m = 2$ before the general case was done. He called this Dyson’s Lemma, in reference to work done by Dyson in 1947. For the $m = 2$ case, the bound is slightly better, namely,

$$\sum_{h=1}^{k} W\Big(t_h, \frac{R}{E}\Big) \le 1 + \Big(\frac{k-2}{2}\Big) \frac{r_2}{r_1}.$$

Viola (1985) gave another argument for the $m = 2$ case. He removed the condition $\alpha_{i1} \ne \alpha_{j1}$, $\alpha_{i2} \ne \alpha_{j2}$ for $i \ne j$ and imposed the condition that $P(X_1,X_2)$ have no factor of the form $X_1 - c$ or $X_2 - c$.

Theorem 4B is algebraic in nature. The proof involves a lot of algebraic geometry 
and will not be given here. 

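Now suppose $K$ is a number field of degree $d$. Let $\boldsymbol{\alpha} = (\alpha_1,\ldots,\alpha_m) \in K^m$ with $\mathbb{Q}(\alpha_i) = K$ $(i = 1,\ldots,m)$ and $\boldsymbol{\beta} = (\beta_1,\ldots,\beta_m) \in \mathbb{Q}^m$. We will apply Theorem 4B of Esnault and Viehweg with $k = d+1$ and $\boldsymbol{\alpha}^{(1)},\ldots,\boldsymbol{\alpha}^{(d)},\boldsymbol{\beta}$, where $\boldsymbol{\alpha}^{(i)}$ denotes the $i$th conjugate of $\boldsymbol{\alpha}$. The $\ell$th coordinates are $\alpha_\ell^{(1)},\ldots,\alpha_\ell^{(d)},\beta_\ell$, which are all distinct. We will set $t_1 = \cdots = t_d = t$ and $t_{d+1} = \tau$. If $P(X_1,\ldots,X_m) \not\equiv 0$ of multidegree $\le R = (r_1,\ldots,r_m)$ has index $\ge t$ at $\boldsymbol{\alpha}^{(i)}$ $(i = 1,\ldots,d)$ and index $\ge \tau$ at $\boldsymbol{\beta}$ with respect to $R$, then Theorem 4B gives

$$dW(t) + W(\tau) \le \prod_{j=1}^{m-1} \Big(1 + (d-1) \sum_{i=j+1}^{m} \frac{r_i}{r_j}\Big).$$

Suppose now that

$$\frac{r_{i+1}}{r_i} \le \frac{1}{2dm\lambda} \quad (i = 1,\ldots,m-1)$$

where $\lambda \ge 1$. Then (since $d \ge 2$)

$$\sum_{i=j+1}^{m} \frac{r_i}{r_j} \le \frac{1}{2dm\lambda} + \Big(\frac{1}{2dm\lambda}\Big)^{2} + \cdots \le \frac{2}{3dm\lambda},$$

and we have

$$\prod_{j=1}^{m-1} \Big(1 + (d-1) \sum_{i=j+1}^{m} \frac{r_i}{r_j}\Big) < \prod_{j=1}^{m-1} \Big(1 + \frac{2}{3m\lambda}\Big) < \Big(1 + \frac{2}{3m\lambda}\Big)^{m} < 1 + \frac{1}{\lambda}$$

since $\lambda \ge 1$. We have proven the following lemma.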


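LEMMA 4C. Suppose $P(X_1,\ldots,X_m) \not\equiv 0$ has coefficients in $\mathbb{Q}$ and has multidegree $\le R = (r_1,\ldots,r_m)$. Furthermore, suppose

$$\frac{r_{i+1}}{r_i} \le \frac{1}{2dm\lambda} \quad (1 \le i \le m-1) \qquad (4.1)$$

is satisfied with $\lambda \ge 1$. Suppose $\boldsymbol{\alpha}, \boldsymbol{\beta}$ are as above and $t, \tau$ satisfy

$$dW(t) + W(\tau) \ge 1 + \frac{1}{\lambda}.$$

Then if $P$ has index $\ge t$ at $\boldsymbol{\alpha}$ with respect to $R$, it follows that $P$ must have index $< \tau$ at $\boldsymbol{\beta}$ with respect to $R$.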
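The elementary estimate behind Lemma 4C, namely that the product $\prod_{j=1}^{m-1} (1 + (d-1) \sum_{i>j} r_i/r_j)$ stays below $1 + 1/\lambda$ under condition (4.1), can be spot-checked with exact rational arithmetic. This is only a sanity check under stated assumptions, not a proof: the function names are hypothetical, $\lambda$ is restricted to positive integers for exactness, and the multidegree is taken to be geometric with ratio exactly $1/(2dm\lambda)$.

```python
from fractions import Fraction


def dyson_product(R, d):
    """prod_{j=1}^{m-1} (1 + (d-1) * sum_{i>j} r_i / r_j), computed exactly."""
    m = len(R)
    prod = Fraction(1)
    for j in range(m - 1):
        tail = sum(Fraction(R[i], R[j]) for i in range(j + 1, m))
        prod *= 1 + (d - 1) * tail
    return prod


def geometric_multidegree(d, m, lam, r_last=1):
    """A multidegree with r_{i+1}/r_i = 1/(2*d*m*lam) exactly (illustrative choice)."""
    ratio = 2 * d * m * lam
    return [r_last * ratio ** (m - 1 - i) for i in range(m)]


for d in (2, 3, 5):
    for m in (2, 3, 4):
        for lam in (1, 2, 10):
            R = geometric_multidegree(d, m, lam)
            assert dyson_product(R, d) < 1 + Fraction(1, lam)
print("product bound < 1 + 1/lambda holds in all sampled cases")
```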


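Exercise 4a. Define $r = r(k,t)$ to be the least integer such that given any $k$ points $\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_k$ in $\mathbb{C}^2$, there exists a polynomial $P(X,Y) \not\equiv 0$ with coefficients in $\mathbb{C}$, of total degree $\le r$, and vanishing of order $\ge t$ at each of $\boldsymbol{\alpha}_1,\ldots,\boldsymbol{\alpha}_k$. Certainly, $r \le r_0(k,t)$ where $r_0$ is least with

$$\binom{r_0+2}{2} > k\binom{t+1}{2}.$$

Thus

$$\limsup_{t \to \infty} \frac{r(k,t)}{t} \le \sqrt{k}.$$

Compute the following:

$$r(2,t) \quad \text{and} \quad \lim_{t \to \infty} \frac{r(2,t)}{t},$$

as well as

$$r(3,t) \quad \text{and} \quad \lim_{t \to \infty} \frac{r(3,t)}{t}.$$


Exercise 4b. Compute $r(4,t)$ and $r(5,t)$.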


§5. Estimation of Volumes. 
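Recall from §3 that $W(t)$ represents the volume of the set of $(\xi_1,\ldots,\xi_m)$ satisfying

$$0 \le \xi_i \le 1 \quad (i = 1,\ldots,m)$$

and

$$\xi_1 + \cdots + \xi_m \le t.$$


LEMMA 5A. If $t \le 1$, then $W(t) = t^m/m!$.
Remark. If $t \le 1$, then $G(t)$ is the set of $(\xi_1,\ldots,\xi_m)$ satisfying

$$0 \le \xi_i \quad \text{and} \quad \xi_1 + \cdots + \xi_m \le t.$$

The lemma may be obtained by induction on $m$.


LEMMA 5B. If $t = \frac{m}{2} - \theta$ where $0 < \theta < \frac{m}{2}$, then $W(t) < e^{-\theta^2/m}$.


Remark. Consider the cube $C$ consisting of points $(\xi_1,\ldots,\xi_m)$ with $0 \le \xi_i \le 1$ $(i = 1,\ldots,m)$. Lemma 5B says that those points satisfying

$$\xi_1 + \cdots + \xi_m \le \frac{m}{2} - \varepsilon m$$

form a small proportion of $C$ and similarly, by symmetry about $(\frac12,\ldots,\frac12)$, for those points satisfying

$$\xi_1 + \cdots + \xi_m \ge \frac{m}{2} + \varepsilon m.$$

Thus most points satisfy

$$\Big|\xi_1 + \cdots + \xi_m - \frac{m}{2}\Big| \le \varepsilon m,$$

which agrees with probability.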
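Proof. Set $q = 3\theta/2m$, so that $q \le 3/4$. For points in $G(t)$, we have

$$\xi_1 + \cdots + \xi_m - \frac{m}{2} \le -\theta.$$

Then

$$-q\Big(\xi_1 + \cdots + \xi_m - \frac{m}{2}\Big) \ge \theta q = 3\theta^2/2m.$$

Consider

$$W(t) e^{3\theta^2/2m} \le \int_0^1 \cdots \int_0^1 \exp\Big(-q\Big(\xi_1 + \cdots + \xi_m - \frac{m}{2}\Big)\Big) d\xi_1 \cdots d\xi_m = \Big(\int_0^1 \exp\Big(-q\Big(\xi - \frac12\Big)\Big) d\xi\Big)^{m} = I^{m},$$

where $I$ is the integral on the right-hand side. By a change of variables, we get

$$I = \int_{-1/2}^{1/2} \exp(-q\xi)\,d\xi = \frac{e^{q/2} - e^{-q/2}}{q} < e^{q^2/9} = e^{\theta^2/4m^2}.$$

Then

$$W(t) e^{3\theta^2/2m} < e^{\theta^2/4m}$$

and

$$W(t) < e^{-5\theta^2/4m} < e^{-\theta^2/m}.$$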
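Both lemmas can be checked against an exact formula for $W(t)$. The sketch below is not taken from the notes: it assumes the classical closed form for the distribution function of a sum of $m$ independent uniform variables on $[0,1]$ (the Irwin-Hall distribution), $W(t) = \frac{1}{m!} \sum_{k=0}^{\lfloor t \rfloor} (-1)^k \binom{m}{k} (t-k)^m$ for $0 \le t \le m$, and evaluates it with rational arithmetic.

```python
from fractions import Fraction
from math import comb, exp, factorial


def W(t, m):
    """Volume of {0 <= xi_i <= 1, xi_1 + ... + xi_m <= t}, via the Irwin-Hall formula."""
    t = Fraction(t)
    if t <= 0:
        return Fraction(0)
    if t >= m:
        return Fraction(1)
    total = sum((-1) ** k * comb(m, k) * (t - k) ** m for k in range(int(t) + 1))
    return total / factorial(m)


# Lemma 5A: W(t) = t^m / m! for t <= 1.
assert W(Fraction(1, 2), 3) == Fraction(1, 2) ** 3 / factorial(3)

# Lemma 5B: W(m/2 - theta) < exp(-theta^2 / m) for 0 < theta < m/2.
m = 10
for theta in (1, 2, 3, 4):
    assert float(W(Fraction(m, 2) - theta, m)) < exp(-theta ** 2 / m)
print("Lemmas 5A and 5B agree with the exact volume in the sampled cases")
```

The bound of Lemma 5B is far from sharp for small $m$, but it has the convenient Gaussian shape used in §7.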


§6. A version of Roth’s Theorem.
Let $\alpha$ be algebraic of degree $d$ and $\beta = x/y$. We will consider inequalities of the type

$$|\alpha - \beta| < 1/h(\beta)^{\mu}.$$

Given $C > 1$, a window of exponential width $C$ will mean an interval of real numbers $\xi$ of type

$$w \le \xi < w^{C}$$

where $w > 1$.


THEOREM 6A. Suppose $\alpha$ is algebraic of degree $d \ge 3$. Suppose $1 < m! \le d$ and $0 < \chi < 1$. Set $\lambda = 2d(6/\chi)^m$ and $c_m = m/(m!)^{1/m}$. Then the rational solutions $\beta$ of the inequality

$$|\alpha - \beta| < h(\beta)^{-c_m d^{1/m}(1+\chi)} \qquad (6.1)$$

have their heights in the union of the interval

$$h(\beta) < (8h(\alpha))^{6\lambda/\chi} = B_1,$$

say, and at most $m-1$ windows of exponential width $C$ where

$$C = 6dm\lambda = 12d^2m(6/\chi)^m.$$


Remarks. In the case $m = 2$, we get $c_m d^{1/m} = \sqrt{2d}$. Thus if $\sqrt{2d} < \mu < 2\sqrt{2d}$, there is a $\chi$ with $\mu = c_m d^{1/m}(1+\chi)$ and $0 < \chi < 1$. This implies the Dyson-Gelfond estimate. In this case, we have at most 1 window.

In the case $m = 3$, we get $c_m = 3/\sqrt[3]{6} = \sqrt[3]{9/2}$ and $\mu > \sqrt[3]{9/2} \cdot \sqrt[3]{d}$. In this case, there are two possible windows.


THEOREM 6B. Suppose $\alpha$ is algebraic of degree $d \ge 3$. Suppose $0 < \delta < 1$ and $m = [(25/\delta)^2 \log 2d]$. Let $\lambda = 2m!$. Then the rational solutions $\beta$ of the inequality

$$|\alpha - \beta| < h(\beta)^{-2-\delta}$$

have their heights in the union of the interval

$$h(\beta) < (4h(\alpha))^{7\lambda/\delta} = B_2,$$

say, and at most $m-1$ windows of exponential width $C = 6dm\lambda$.


Remark. Theorem 6B contains Roth’s Theorem.
Let $\alpha$ be given in a number field $K$ with $\deg K = d$ and $K = \mathbb{Q}(\alpha)$. Furthermore, let $\beta \in \mathbb{Q}$ and $\lambda > 1$ be given. We introduce the mixed height of $\alpha$ and $\beta$, given by

$$h_\lambda(\alpha,\beta) = (4h(\alpha))^{\lambda} \cdot 4h(\beta).$$

We will now state the main theorems.


THEOREM 6C. Let $(\alpha_1,\beta_1),\ldots,(\alpha_m,\beta_m)$ be such that $\mathbb{Q}(\alpha_i) = K$ and $\beta_i \in \mathbb{Q}$ $(i = 1,\ldots,m)$. Suppose that
(i) $1 < m! \le d$,
(ii) $\lambda \ge 2d \cdot 4^m$,
(iii) $|\alpha_i - \beta_i| < h_\lambda(\alpha_i,\beta_i)^{-c_m d^{1/m}(1+4(2d/\lambda)^{1/m})}$ $(i = 1,\ldots,m)$,
(iv) $h_\lambda(\alpha_{i+1},\beta_{i+1}) > h_\lambda(\alpha_i,\beta_i)^{3dm\lambda}$ $(i = 1,\ldots,m-1)$.
This is impossible!


THEOREM 6D. Let $(\alpha_1,\beta_1),\ldots,(\alpha_m,\beta_m)$ be as above, and suppose that
(i) $m > 36 \log 2d$,
(ii) $\lambda \ge 2m!$,
(iii) $|\alpha_i - \beta_i| < h_\lambda(\alpha_i,\beta_i)^{-2-12\sqrt{(\log 2d)/m}}$ $(i = 1,\ldots,m)$,
(iv) as above.
This is impossible.


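Remark. Theorems 6C and 6D give Theorems 6A and 6B, respectively.
First, we will verify that Theorem 6C implies Theorem 6A. Consider solutions $\beta$ of (6.1), i.e.

$$|\alpha - \beta| < h(\beta)^{-c_m d^{1/m}(1+\chi)}.$$

Let $B_1 = (8h(\alpha))^{6\lambda/\chi}$, and suppose $h(\beta) \ge B_1$. Let $\lambda = 2d(6/\chi)^m$. Then $\chi = 6(2d/\lambda)^{1/m}$ and (6.1) implies that

$$|\alpha - \beta| < h(\beta)^{-c_m d^{1/m}(1+\chi)} \le h(\beta)^{-c_m d^{1/m}(1+4(2d/\lambda)^{1/m})} (8h(\alpha))^{-c_m d^{1/m} \cdot 2\lambda} < h_\lambda(\alpha,\beta)^{-c_m d^{1/m}(1+4(2d/\lambda)^{1/m})},$$

since $2 > 1 + 4(2d/\lambda)^{1/m}$ and $8^{\lambda} > 4^{\lambda} \cdot 4$. If there is no approximation $\beta$ with $h(\beta) \ge B_1$, then we are finished. Otherwise, let $\beta_1$ have minimal height with $h(\beta) \ge B_1$. If each such $\beta$ has $h(\beta) < h(\beta_1)^{6dm\lambda}$, then they all lie in a single window and we are done. Otherwise, let $\beta_2$ have minimal height with $h(\beta) \ge h(\beta_1)^{6dm\lambda}$. Then

$$h_\lambda(\alpha,\beta_2) > h(\beta_2) \ge h(\beta_1)^{6dm\lambda} \ge (h(\beta_1) B_1)^{3dm\lambda} > (h(\beta_1) \cdot (8h(\alpha))^{\lambda})^{3dm\lambda} \ge h_\lambda(\alpha,\beta_1)^{3dm\lambda}.$$

Continue in this fashion. If the solutions with $h(\beta) \ge B_1$ do not lie in $m-1$ windows, then $\beta_1,\beta_2,\ldots,\beta_m$ can be found, and Theorem 6C with $\alpha_i = \alpha$ $(i = 1,\ldots,m)$ gives a contradiction.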
Next, we will show that Theorem 6D implies Theorem 6B. Suppose that $|\alpha - \beta| < h(\beta)^{-2-\delta}$ and

$$h(\beta) \ge B_2 = (4h(\alpha))^{7\lambda/\delta}.$$

With $m, \lambda$ as in Theorem 6B, we have

$$\frac{\log 2d}{m} < \Big(\frac{\delta}{24}\Big)^{2}, \qquad \delta > 24\sqrt{(\log 2d)/m}.$$

We infer that

$$|\alpha - \beta| < h(\beta)^{-2-\delta} < h(\beta)^{-2-12\sqrt{(\log 2d)/m}} B_2^{-12\sqrt{(\log 2d)/m}} = h(\beta)^{-2-12\sqrt{(\log 2d)/m}} (4h(\alpha))^{-84\lambda\sqrt{(\log 2d)/m}/\delta}.$$

But since

$$m \le \Big(\frac{25}{\delta}\Big)^{2} \log 2d, \qquad \sqrt{(\log 2d)/m}\,/\delta \ge \frac{1}{25},$$

we obtain

$$|\alpha - \beta| < h(\beta)^{-2-12\sqrt{(\log 2d)/m}} (4h(\alpha))^{-3\lambda} < h_\lambda(\alpha,\beta)^{-2-12\sqrt{(\log 2d)/m}}.$$

We now proceed as with Theorem 6A: If there is no approximation $\beta$ with $h(\beta) \ge B_2$, then we are finished. Otherwise, ... .


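§7. Proof of the Main Theorems, i.e., Theorems 6C, 6D.
Suppose $(\alpha_1,\beta_1),\ldots,(\alpha_m,\beta_m)$ are given such that $\mathbb{Q}(\alpha_i) = K$ and $\beta_i \in \mathbb{Q}$ $(i = 1,\ldots,m)$. Pick $t = t(m,\lambda)$ such that $dW(t) = 1 - (1/\lambda)$, and $\tau$ such that $W(\tau) = 2/\lambda$. We will use Lemma 3A to construct a polynomial $P(X_1,\ldots,X_m)$ of multidegree $\le R = (r_1,\ldots,r_m)$, where the $r_i$ are large, such that $P$ has index $\ge t$ at $(\alpha_1,\ldots,\alpha_m)$ with respect to $R$. We can choose $r_1,\ldots,r_m$ such that

$$\frac{r_{i+1}}{r_i} \le \frac{1}{2dm\lambda} \quad (i = 1,\ldots,m-1).$$

Then by Lemma 4C, we know that $P$ will have index $< \tau$ at $(\beta_1,\ldots,\beta_m) \in \mathbb{Q}^m$. Thus there exists an $I = (i_1,\ldots,i_m)$ with

$$\frac{i_1}{r_1} + \cdots + \frac{i_m}{r_m} < \tau \qquad (7.1)$$

such that $P^{I}(\boldsymbol{\beta}) \ne 0$. Set $Q(X) = P^{I}(X)$. Then by (3.1) we have $|P^{I}| = |Q| \le 2^{r}|P|$ where $r = r_1 + \cdots + r_m$. Moreover, by (3.2) we have $|Q^{J}| \le 3^{r}|P|$ for any $J$. Since $P$ had index $\ge t$ at $\boldsymbol{\alpha}$, it follows from (7.1) that $Q = P^{I}$ has index $\ge t - \tau$ at $\boldsymbol{\alpha}$ (with respect to $R$). Since

$$\frac{dW(t)}{1 - dW(t)} = \frac{1 - (1/\lambda)}{1/\lambda} = \lambda - 1,$$

we have

$$\frac{dW(t)}{1 - dW(t)} + \varepsilon \le \lambda$$

for $\varepsilon > 0$ sufficiently small. Then by Lemma 3A, the polynomial $P$ can be constructed such that

$$|P| \le \big((4h(\alpha_1))^{r_1} \cdots (4h(\alpha_m))^{r_m}\big)^{\lambda}.$$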
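Then

$$|Q^{J}| \le 3^{r}|P| \le 3^{r} \big((4h(\alpha_1))^{r_1} \cdots (4h(\alpha_m))^{r_m}\big)^{\lambda}.$$

Writing $\beta_i = x_i/y_i$ $(i = 1,\ldots,m)$, we have $y_1^{r_1} \cdots y_m^{r_m} Q(\boldsymbol{\beta}) \in \mathbb{Z} \setminus \{0\}$. Thus

$$1 \le |y_1^{r_1} \cdots y_m^{r_m} Q(\boldsymbol{\beta})|.$$

Writing the Taylor expansion for $Q$ about $\boldsymbol{\alpha}$ gives

$$Q(\boldsymbol{\beta}) = \sum_{\substack{J = (j_1,\ldots,j_m) \\ \frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \ge t - \tau}} (\beta_1 - \alpha_1)^{j_1} \cdots (\beta_m - \alpha_m)^{j_m} Q^{J}(\boldsymbol{\alpha}),$$

where the sum is restricted to $\frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \ge t - \tau$ since all lower order partial derivatives vanish at $\boldsymbol{\alpha}$. By condition (iii), we have

$$|\alpha_i - \beta_i| < 1/2 \quad (i = 1,\ldots,m),$$

which gives

$$|\alpha_i| < |\beta_i| + \frac12 = \frac{|x_i|}{|y_i|} + \frac12 = \frac{|x_i| + (1/2)|y_i|}{|y_i|} \quad (i = 1,\ldots,m).$$

Applying Cauchy’s inequality to the right-hand side gives

$$|\alpha_i| \le \frac{\sqrt{5/4}\,h(\beta_i)}{|y_i|} \quad (i = 1,\ldots,m).$$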
t 


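Now we have

$$|Q^{J}(\boldsymbol{\alpha})| \le |Q^{J}| \Big(\prod_{i=1}^{m} (r_i+1)\Big) \prod_{i=1}^{m} \max(1,|\alpha_i|)^{r_i} \le 3^{r}|P| \Big(\prod_{i=1}^{m} (r_i+1)\Big) \prod_{i=1}^{m} \Big(\frac{\sqrt{5/4}\,h(\beta_i)}{|y_i|}\Big)^{r_i},$$

and therefore

$$|y_1^{r_1} \cdots y_m^{r_m} Q^{J}(\boldsymbol{\alpha})| \le 3^{r}|P| \Big(\prod_{i=1}^{m} (r_i+1)\Big) \prod_{i=1}^{m} \big(\sqrt{5/4}\,h(\beta_i)\big)^{r_i}.$$

The Taylor expansion for $Q$ gives us

$$|y_1^{r_1} \cdots y_m^{r_m} Q(\boldsymbol{\beta})| \le 3^{r}|P| \Big(\prod_{i=1}^{m} (r_i+1)\Big)^{2} \Big(\prod_{i=1}^{m} \big(\sqrt{5/4}\,h(\beta_i)\big)^{r_i}\Big) \max_{J} \big(|\alpha_1 - \beta_1|^{j_1} \cdots |\alpha_m - \beta_m|^{j_m}\big),$$

where the maximum is over $J$ with $\frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \ge t - \tau$. So for $r_i$ large,

$$|y_1^{r_1} \cdots y_m^{r_m} Q(\boldsymbol{\beta})| \le \Big(\prod_{i=1}^{m} (4h(\beta_i))^{r_i}\Big) |P| \max_{J} \big(|\alpha_1 - \beta_1|^{j_1} \cdots |\alpha_m - \beta_m|^{j_m}\big)$$

and

$$1 \le \Big(\prod_{i=1}^{m} h_\lambda(\alpha_i,\beta_i)^{r_i}\Big) \max_{J} \big(|\alpha_1 - \beta_1|^{j_1} \cdots |\alpha_m - \beta_m|^{j_m}\big). \qquad (7.2)$$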


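Now suppose that

$$|\beta_i - \alpha_i| \le h_\lambda(\alpha_i,\beta_i)^{-\psi}$$

with $\psi > 0$. Then (7.2) yields

$$1 \le \Big(\prod_{i=1}^{m} h_\lambda(\alpha_i,\beta_i)^{r_i}\Big) \max_{J} \big(h_\lambda(\alpha_1,\beta_1)^{-j_1\psi} \cdots h_\lambda(\alpha_m,\beta_m)^{-j_m\psi}\big).$$

Following Bombieri and Van der Poorten (1987) we take logarithms of both sides. Write $L_i = \log h_\lambda(\alpha_i,\beta_i)$ $(i = 1,\ldots,m)$. We then obtain

$$0 \le r_1 L_1 + \cdots + r_m L_m - \psi \min_{\frac{j_1}{r_1} + \cdots + \frac{j_m}{r_m} \ge t - \tau} (j_1 L_1 + \cdots + j_m L_m).$$

Putting $\varphi_i = j_i/r_i$ $(i = 1,\ldots,m)$, we have

$$r_1 L_1 + \cdots + r_m L_m \ge \psi \min_{\substack{\varphi_1 + \cdots + \varphi_m \ge t - \tau \\ \varphi_i \in \mathbb{R}}} (\varphi_1 r_1 L_1 + \cdots + \varphi_m r_m L_m). \qquad (7.3)$$

Now choose $r_i = [L/L_i]$ $(i = 1,\ldots,m)$ where $L$ is large. By condition (iv), we have $L_{i+1} > 3dm\lambda L_i$ $(i = 1,\ldots,m-1)$, so that

$$\frac{r_{i+1}}{r_i} \le \frac{1}{2dm\lambda} \quad (i = 1,\ldots,m-1).$$

Dividing (7.3) by $L$ and letting $L \to \infty$, we get

$$m \ge \psi \min_{\substack{\varphi_1 + \cdots + \varphi_m \ge t - \tau \\ \varphi_i \in \mathbb{R}}} (\varphi_1 + \cdots + \varphi_m) = \psi(t - \tau).$$

Hence

$$\psi \le \frac{m}{t - \tau}. \qquad (7.4)$$

This is the key inequality. Using it together with estimates for $m$, $t$, and $\tau$, we will prove the two main theorems.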

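Recall that $dW(t) = 1 - (1/\lambda)$ and $W(\tau) = 2/\lambda$. In Theorem 6C, we have (i) $1 < m! \le d$ and (ii) $\lambda \ge 2d \cdot 4^m$. This gives

$$W(t) < \frac{1}{d} \le \frac{1}{m!} \quad \text{and} \quad W(\tau) = \frac{2}{\lambda} \le \frac{1}{m!},$$

and by Lemma 5A,

$$t = (m!)^{1/m} d^{-1/m} (1 - (1/\lambda))^{1/m} \quad \text{and} \quad \tau = (m!)^{1/m} (2/\lambda)^{1/m}.$$

So we have

$$t - \tau = (m!)^{1/m} d^{-1/m} \big((1 - (1/\lambda))^{1/m} - (2d/\lambda)^{1/m}\big) \ge (m!)^{1/m} d^{-1/m} \big((1 - (1/\lambda)) - (2d/\lambda)^{1/m}\big) = (m!)^{1/m} d^{-1/m} (1 - \eta)$$

where

$$\eta = (1/\lambda) + (2d/\lambda)^{1/m} \le 2(2d/\lambda)^{1/m} \le 1/2 \quad \text{(by (ii))}. \qquad (7.5)$$

Observe that

$$(1 - \eta)^{-1} \le 1 + 2\eta \le 1 + 4(2d/\lambda)^{1/m} \quad \text{(by (7.5))}.$$

Therefore

$$(t - \tau)^{-1} \le d^{1/m} (m!)^{-1/m} (1 + 4(2d/\lambda)^{1/m}).$$

Now we are in a position to estimate $\psi$ with

$$|\beta_i - \alpha_i| \le h_\lambda(\alpha_i,\beta_i)^{-\psi}.$$

By (7.4) we have

$$\psi \le \frac{m}{t - \tau} \le \frac{m}{(m!)^{1/m}} d^{1/m} (1 + 4(2d/\lambda)^{1/m}) = c_m d^{1/m} (1 + 4(2d/\lambda)^{1/m}).$$

However, in Theorem 6C (iii), we have

$$|\alpha_i - \beta_i| < h_\lambda(\alpha_i,\beta_i)^{-c_m d^{1/m}(1+4(2d/\lambda)^{1/m})},$$

which gives a contradiction.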

We now turn to Theorem 6D. As before, $dW(t) = 1 - (1/\lambda)$ and $W(\tau) = 2/\lambda \le 1/(m!)$ by (ii). Since $W(\tau)$ is an increasing function of $\tau$, we may infer from Lemma 5A that $\tau \le 1$. Next, $d \ge 2$ gives us $W(t) < 1/2$. Therefore $t < m/2$, say $t = (m/2) - \theta$ with $0 < \theta < m/2$. Then by Lemma 5B,

$$W(t) < e^{-\theta^2/m},$$

so that

$$\frac{1}{d}\Big(1 - \frac{1}{\lambda}\Big) < e^{-\theta^2/m}.$$

Now we have

$$e^{\theta^2/m} < d\Big(1 - \frac{1}{\lambda}\Big)^{-1} \le 2d \quad \text{(by (ii))}.$$

Taking the logarithm of both sides gives

$$\theta < \sqrt{m \log 2d}$$

and

$$t > (m/2) - \sqrt{m \log 2d}.$$

So

$$t - \tau > (m/2) - 1 - \sqrt{m \log 2d}.$$

Now we may return to our task of estimating $\psi$. By (7.4),

$$\psi \le \frac{m}{t - \tau},$$

so that we obtain

$$\psi \le \frac{m}{(m/2) - 1 - \sqrt{m \log 2d}} = \frac{2}{1 - (2/m) - 2\sqrt{(\log 2d)/m}} < \frac{2}{1 - 3\sqrt{(\log 2d)/m}} < 2\big(1 + 6\sqrt{(\log 2d)/m}\big) \quad \text{(by (i))}$$

$$= 2 + 12\sqrt{(\log 2d)/m}.$$

But Theorem 6D (iii) is

$$|\alpha_i - \beta_i| < h_\lambda(\alpha_i,\beta_i)^{-2-12\sqrt{(\log 2d)/m}},$$

which gives the desired contradiction.


§8. Counting Good Rational Approximations. In the summer of 1987, in the course of a number theory conference in Budapest, A. Schinzel asked the following, almost philosophical question: “But how can it be, how can it be in number theory, that one could prove the finiteness of a set of natural numbers, without being able to give a bound for its cardinality?” The next day he himself provided the following explanation.

Suppose we are given a set $S$ of positive integers and suppose we can prove that if $y, y'$ are in $S$, then $y' \le 2y$. Then $S$ must be finite. However, unless we know at least one $y$ in $S$, we are unable to estimate the cardinality of $S$ without further information. We will generalize Schinzel’s remark as follows. Given $C > 1$, a set $S$ of positive numbers is a $C$-set, if for any $y, y'$ in $S$ we have $y' \le Cy$. Any $C$-set consisting of integers is finite.

Now let $\gamma > 1$ be given.

A $\gamma$-set is a set of positive real numbers with the following property: if $y, y'$ are in the set and $y < y'$, then $y' \ge \gamma y$. Thus a $\gamma$-set has a certain “Gap Principle”. A set which is both a $C$-set and a $\gamma$-set will be called a $(C,\gamma)$-set. Its elements are positive real numbers, not necessarily integers.


LEMMA 8A. Suppose $C > 1$ and $\gamma > 1$ are given. The cardinality of any $(C,\gamma)$-set is

$$\le 1 + (\log C)/\log \gamma.$$


Proof. Let $C > 1$ and $\gamma > 1$ be given. Suppose $y_0 < y_1 < y_2 < \cdots < y_\nu$ belong to a $(C,\gamma)$-set. Then

$$C y_0 \ge y_\nu \ge y_0 \gamma^{\nu}.$$

Therefore

$$\nu \le (\log C)/\log \gamma,$$

and the cardinality of the $(C,\gamma)$-set is

$$\le 1 + (\log C)/\log \gamma.$$
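The two defining properties and the bound of Lemma 8A are easy to test on finite sets. The following is a small illustrative sketch, with hypothetical helper names that do not appear in the notes; it assumes the set is given as a finite list of distinct positive numbers.

```python
from math import log


def is_C_gamma_set(ys, C, gamma):
    """Check both properties: y' <= C*y for all pairs, and y' >= gamma*y whenever y < y'."""
    ys = sorted(ys)
    if not ys or ys[0] <= 0:
        return False
    c_ok = ys[-1] <= C * ys[0]                                # enough to compare max with min
    gap_ok = all(b >= gamma * a for a, b in zip(ys, ys[1:]))  # enough to compare neighbours
    return c_ok and gap_ok


def lemma_8A_bound(C, gamma):
    """The bound 1 + (log C) / (log gamma) of Lemma 8A."""
    return 1 + log(C) / log(gamma)


S = [1, 3, 9]
print(is_C_gamma_set(S, C=10, gamma=3), len(S), lemma_8A_bound(10, 3))
```

Checking only the extreme pair for the $C$-property and only consecutive pairs for the gap property suffices, because both conditions are monotone along the sorted list.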


Suppose $\delta > 0$. Let

$$L = \log(1 + \delta). \qquad (8.1)$$


LEMMA 8B. Let a real number $\xi$ be given. The number of reduced fractions $x/y$ with

$$\Big|\xi - \frac{x}{y}\Big| < \frac{1}{2y^{2+\delta}} \qquad (8.2)$$

and $y$ in a window of exponential width $C$ is

$$\le 1 + (\log C)/L.$$


Proof. If $y, y'$ are in a window of exponential width $C$, then $y' \le y^{C}$. We will call this an exponential $C$-set. Now, if $x/y$, $x'/y'$ satisfy the hypotheses and $x/y \ne x'/y'$, say $y' \ge y$, then

$$\frac{1}{yy'} \le \Big|\frac{x}{y} - \frac{x'}{y'}\Big| \le \Big|\xi - \frac{x}{y}\Big| + \Big|\xi - \frac{x'}{y'}\Big| < \frac{1}{2y^{2+\delta}} + \frac{1}{2y'^{2+\delta}} \le \frac{1}{y^{2+\delta}},$$

and we have

$$y' > y^{1+\delta} = y^{\gamma},$$

where $\gamma = 1 + \delta$. We call such a set an exponential $\gamma$-set. The logarithms of the numbers $y$ form a $(C,\gamma)$-set. By Lemma 8A its cardinality is

$$\le 1 + \frac{\log C}{\log \gamma} = 1 + \frac{\log C}{L}.$$


Suppose now that $1 < A < B$ are given, and consider rational approximations to $\xi$ with

$$\Big|\xi - \frac{x}{y}\Big| < \frac{1}{2y^{2+\delta}}$$

and $A \le y \le B$. The denominators $y$ lie in a window of exponential width $C = (\log B)/\log A$. Therefore, Lemma 8B says the number of such $y$ is

$$\le 1 + \frac{\log((\log B)/\log A)}{L}.$$
There are a couple of drawbacks to Lemma 8B. We cannot let $A$ go to 1, since $C = (\log B)/\log A$. Secondly, we have a 2 in the denominator in (8.2). We will try to remove these drawbacks. For $\delta > 0$, we will call $x/y$ a $\delta$-approximation to $\xi$ if $y > 0$ and $(x,y) = 1$ and

$$\Big|\xi - \frac{x}{y}\Big| < \frac{1}{y^{2+\delta}}. \qquad (8.3)$$

Then we have the following results.


LEMMA 8C. The number of $\delta$-approximations to $\xi$ with $y$ in a window $W \le y < W^{C}$ where $W \ge 4^{1/\delta}$ is

$$\le 1 + (\log 2C)/L.$$

Proof. Suppose $x/y$ and $x'/y'$ are such approximations and $y' \ge y$. Using the same argument as above (i.e., in the proof of Lemma 8B), we get

$$y' \ge y^{1+\delta}/2.$$

Then

$$\log y' \ge (1+\delta) \log y - \log 2.$$

Now suppose that $x_0/y_0, x_1/y_1, \ldots, x_\nu/y_\nu$ are such approximations with $y_0 \le y_1 \le \cdots \le y_\nu$. Then

$$\log y_1 \ge (1+\delta) \log y_0 - \log 2,$$
$$\log y_2 \ge (1+\delta) \log y_1 - \log 2 \ge (1+\delta)^2 \log y_0 - ((1+\delta) + 1) \log 2,$$
$$\vdots$$
$$\log y_\nu \ge (1+\delta)^{\nu} \log y_0 - ((1+\delta)^{\nu-1} + \cdots + (1+\delta) + 1) \log 2 > (1+\delta)^{\nu} (\log W - (\log 2)/\delta).$$

Since $W \ge 4^{1/\delta}$, we have

$$\log W \ge (\log 4)/\delta = 2(\log 2)/\delta$$

and

$$\log y_\nu \ge (1+\delta)^{\nu} (\log W)/2.$$

We also have $y_\nu < W^{C}$, so that

$$C \log W > \log y_\nu \ge (1+\delta)^{\nu} (\log W)/2.$$

Thus

$$(1+\delta)^{\nu} \le 2C$$

and

$$\nu \le (\log 2C)/L.$$

The lemma follows.
THEOREM 8D. The number of $\delta$-approximations with denominators $y \le B$, where $B \ge e$, is

$$< L^{-1} \log\log B + 20((1/\delta) + 1).$$

Recall, $L = \log(1 + \delta)$, as in (8.1).
This result, as well as Theorem 8E below, is due to Mueller and Schmidt (1989).


Proof. We will say that “large solutions” are those with $e^{2/\delta} < y \le B$. These form a window of exponential width

$$C = \frac{\log B}{\log e^{2/\delta}} = \frac{\delta}{2} \log B.$$

By Lemma 8C, the number of solutions here is

$$\le 1 + \frac{\log(\delta \log B)}{L} = 1 + \frac{\log\log B}{L} + \frac{\log \delta}{L} < \frac{\log\log B}{L} + 2,$$

since $L = \log(1 + \delta) > \log \delta$.
We will let “small solutions” be those with $y \le e^{2/\delta}$. Given an integer $u$, let the set $S(u)$ consist of $\delta$-approximations with $e^{u} \le y < e^{u+1}$. Suppose

$$\frac{x_0}{y_0} < \frac{x_1}{y_1} < \cdots < \frac{x_\rho}{y_\rho}$$

are elements of $S(u)$. Any two consecutive elements cannot be too close, since

$$\frac{x_{i+1}}{y_{i+1}} - \frac{x_i}{y_i} \ge \frac{1}{y_i y_{i+1}} > e^{-2u-2} \quad (i = 0,\ldots,\rho-1).$$

We may conclude that

$$\frac{x_\rho}{y_\rho} - \frac{x_0}{y_0} > \rho e^{-2u-2}.$$

On the other hand, we have

$$\frac{x_\rho}{y_\rho} - \frac{x_0}{y_0} \le \Big|\xi - \frac{x_0}{y_0}\Big| + \Big|\xi - \frac{x_\rho}{y_\rho}\Big| < \frac{1}{y_0^{2+\delta}} + \frac{1}{y_\rho^{2+\delta}} \le 2e^{-u(2+\delta)}.$$

Combining these two estimates, we get $\rho < 2e^{2-u\delta}$, and hence

$$\mathrm{card}(S(u)) = 1 + \rho < 1 + 2e^{2-u\delta}.$$
The “small approximations” lie in the union of $S(0), S(1), \ldots, S(k)$ where $k = [\log e^{2/\delta}] \le 2/\delta$. So the total number of “small” $\delta$-approximations is

$$\le \sum_{u=0}^{k} (1 + 2e^{2-u\delta}) < k + 1 + 2e^{2} \sum_{u=0}^{\infty} e^{-u\delta} \le \frac{2}{\delta} + 1 + 2e^{2}(1 - e^{-\delta})^{-1} < \frac{2}{\delta} + 1 + 2e^{2}\Big(\frac{1}{\delta} + 1\Big).$$

Adding the estimates for the numbers of “small” and “large” $\delta$-approximations, we get a bound which is less than

$$L^{-1} \log\log B + 20((1/\delta) + 1).$$
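For a concrete number the bound of Theorem 8D can be compared with a brute-force count. The sketch below is an illustration under stated assumptions: it takes $\xi = \sqrt{2}$ in floating point (adequate for the small denominators used here, but not a rigorous computation), and the function names are hypothetical. Since $|\xi - x/y| < y^{-2-\delta} \le 1/y$ forces $|y\xi - x| < 1$, only the two integers next to $y\xi$ need to be tried as numerators.

```python
from math import floor, gcd, log, sqrt


def delta_approximations(xi, delta, B):
    """All reduced x/y with 1 <= y <= B and |xi - x/y| < y^-(2+delta) (float arithmetic)."""
    found = []
    for y in range(1, B + 1):
        x0 = floor(xi * y)
        for x in (x0, x0 + 1):  # the only possible numerators
            if gcd(x, y) == 1 and abs(xi - x / y) < y ** -(2.0 + delta):
                found.append((x, y))
    return found


def theorem_8D_bound(delta, B):
    """L^-1 * log log B + 20 * (1/delta + 1), with L = log(1 + delta)."""
    L = log(1 + delta)
    return log(log(B)) / L + 20 * (1 / delta + 1)


approx = delta_approximations(sqrt(2), 0.5, 10_000)
print(len(approx), "approximations; bound:", theorem_8D_bound(0.5, 10_000))
```

For an algebraic number such as $\sqrt{2}$ the count is tiny compared with the bound; Theorem 8E shows that the main term is nevertheless attained for suitable transcendental $\xi$.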
The following theorem tells us that the main term in the conclusion of Theorem 8D (i.e., $L^{-1} \log\log B$) is indeed best possible.


THEOREM 8E. Let $\delta > 0$ be given. There exists a real transcendental number $\xi$ such that for every $B \ge e$, the number of $\delta$-approximations with $y \le B$ is

$$\ge \frac{\log\log B}{L} + \frac{\log(\delta/2)}{L}.$$


The proof uses continued fractions. For an introduction to continued fractions, see Hardy and Wright (1954), or Cassels (1957), or Schmidt (1980).


Proof. Given natural numbers $a_1,\ldots,a_n$, write

$$\cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}} = \frac{p_n}{q_n}$$

with $q_n > 0$ and g.c.d.$(p_n,q_n) = 1$. We will define a sequence $a_1, a_2, \ldots$ inductively. Set $a_1 = 1$. Given $a_1,\ldots,a_n$, let $a_{n+1}$ be the least integer with†

$$a_{n+1} q_n + q_{n-1} \ge 2q_n^{1+\delta}.$$

† The factor 2 on the right-hand side is not needed for our present purpose but will come in handy in §9.

Then since (by the theory of continued fractions) $q_{n+1} = a_{n+1} q_n + q_{n-1}$, we have

$$2q_n^{1+\delta} \le q_{n+1} < 3q_n^{1+\delta}.$$

Again from the theory of continued fractions, the limit $\lim_{n \to \infty} p_n/q_n$ exists. Denote this limit by $\xi$. It is customary to write

$$\xi = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}$$

and to interpret the right hand side as an “infinite continued fraction”.
It is known that

$$\Big|\xi - \frac{p_n}{q_n}\Big| < \frac{1}{q_n q_{n+1}},$$

so that in our case

$$\Big|\xi - \frac{p_n}{q_n}\Big| < \frac{1}{2q_n^{2+\delta}}. \qquad (8.4)$$

We would like to know how fast these denominators grow. We have

$$q_1 = 1, \quad \log q_1 = 0,$$
$$q_2 \le 3, \quad \log q_2 \le \log 3,$$
$$q_3 \le 3 \cdot 3^{1+\delta}, \quad \log q_3 \le ((1+\delta) + 1) \log 3.$$

For arbitrary $n$, the inequality is

$$\log q_n \le ((1+\delta)^{n-2} + \cdots + (1+\delta) + 1) \log 3 < \frac{(1+\delta)^{n-1}}{\delta} \log 3.$$

So the number of $q_n \le B$ is at least $N$, where $N$ is the largest integer with

$$\frac{(1+\delta)^{N-1}}{\delta} \log 3 \le \log B.$$

Then

$$\frac{(1+\delta)^{N}}{\delta} > \frac{\log B}{\log 3}$$

and

$$N > L^{-1} \log(\delta \log B/\log 3) > L^{-1} \log\log B + L^{-1} \log(\delta/2).$$

Since we have infinitely many $\delta$-approximations, we know that $\xi$ is transcendental by Roth’s Theorem.
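The inductive construction of the partial quotients can be carried out exactly when $\delta$ is a positive integer. The sketch below makes that restriction purely as an assumption for exact integer arithmetic (the theorem allows any $\delta > 0$), uses the convention $q_0 = 1$, $q_1 = a_1 = 1$ for the continued fraction $1/(a_1 + 1/(a_2 + \cdots))$, and the function name is hypothetical.

```python
def build_denominators(delta: int, n_terms: int):
    """Partial quotients a_1..a_n and denominators q_1..q_n of the construction.

    a_{n+1} is the least integer with a_{n+1}*q_n + q_{n-1} >= 2*q_n^(1+delta).
    Assumes delta is a positive integer so that all quantities are exact.
    """
    a = [1]            # a_1 = 1
    q_prev, q = 1, 1   # q_0 = 1, q_1 = a_1 = 1
    qs = [q]
    for _ in range(n_terms - 1):
        target = 2 * q ** (1 + delta)
        a_next = -(-(target - q_prev) // q)  # ceiling division
        q_prev, q = q, a_next * q + q_prev   # q_{n+1} = a_{n+1} q_n + q_{n-1}
        a.append(a_next)
        qs.append(q)
    return a, qs


a, qs = build_denominators(delta=1, n_terms=6)
for q_n, q_next in zip(qs, qs[1:]):
    # the double inequality 2 q_n^(1+delta) <= q_{n+1} < 3 q_n^(1+delta)
    assert 2 * q_n ** 2 <= q_next < 3 * q_n ** 2
print(a[:4], qs[:4])
```

The denominators grow doubly exponentially, which is exactly why only about $L^{-1} \log\log B$ of them lie below $B$.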
Exercise 8a. The number of approximations $x/y$ in reduced form with

$$\Big|\xi - \frac{x}{y}\Big| < \frac{1}{y^{2} \log y}$$

and $2 \le y \le B$ is

$$< (1 + \varepsilon) \frac{\log B}{\log\log B}$$

when $B > B_0(\varepsilon)$.


§9. The Number of Good Approximations to Algebraic Numbers. 


THEOREM 9A. Suppose that $\alpha$ is algebraic of degree $d \ge 3$, that $m$ satisfies $1 < m! < d$, and that $\mu = c_m d^{1/m}(1+\chi)$ where $\chi > 0$. Then the number of approximations $x/y$ to $\alpha$ with

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^{\mu}} \qquad (9.1)$$

is†

$$\ll_{m,\chi} 1 + \frac{\log^{+} \log h(\alpha)}{\log d},$$

where $\ll_{m,\chi}$ means that the implicit constant may depend on $m$ and $\chi$. Furthermore, the number of such approximations with $y \ge h(\alpha)$ is

$$\ll_{m,\chi} 1.$$

† $\log^{+} z = \log z$ if $z > 1$, and $\log^{+} z = 0$ if $0 < z \le 1$.

Consider the case $m = 2$. Then, e.g., $\mu = 3\sqrt{d}/2$ has $\mu = \sqrt{2d}(1+\chi)$ for some $\chi > 0$. Theorem 9A says the number of approximations satisfying

$$\Big|\alpha - \frac{x}{y}\Big| < \frac{1}{y^{3\sqrt{d}/2}}$$

is bounded as indicated, i.e.,

$$\ll 1 + \frac{\log^{+} \log h(\alpha)}{\log d}.$$

Proof. We may suppose that $0 < \chi < 1$. Then $\mu > c_m d^{1/m} \ge c_m 3^{1/m} > 2.1$. As before, we distinguish between “small” and “large solutions”.
“Small solutions” will be those satisfying

$$y < (8h(\alpha))^{12\lambda/\chi} = B_1,$$

where $\lambda = 2d(12/\chi)^m$. By Theorem 8D with $\delta = \mu - 2$, the number of such approximations is

$$< \frac{\log\log B_1}{\log(\mu - 1)} + 20\Big(\frac{1}{\mu - 2} + 1\Big) \ll \frac{\log\log B_1}{\log d} + 1.$$

(Here and in the rest of this section, $\ll$ means $\ll_{m,\chi}$.)

We will estimate $\log\log B_1$. Since

$$\log B_1 = \frac{12\lambda}{\chi} \log(8h(\alpha)) = 2d\Big(\frac{12}{\chi}\Big)^{m+1} \log(8h(\alpha)),$$

we have that

$$\log\log B_1 \ll \log d + \log^{+} \log h(\alpha) + 1.$$

Thus the number of “small solutions” is

$$\ll 1 + \frac{\log^{+} \log h(\alpha)}{\log d}.$$
ga K 


Now consider the “large solutions”, i.e., those satisfying

$$y \ge (8h(\alpha))^{12\lambda/\chi} = B_1.$$

We will write $\beta = x/y$. We have from (9.1) that

$$\Big|\frac{x}{y}\Big| < |\alpha| + 1,$$

so that

$$h(\beta) < (|\alpha| + 2) y \le 3h(\alpha)^{d} y,$$

since $|\alpha| \le h(\alpha)^{d}$. Now consider

$$y^{\mu} = y^{c_m d^{1/m}(1+\chi)} = y^{c_m d^{1/m}(1+(\chi/2))}\, y^{c_m d^{1/m}(\chi/2)} \ge h(\beta)^{c_m d^{1/m}(1+(\chi/2))} \big((3h(\alpha)^{d})^{-2} y^{\chi/2}\big)^{c_m d^{1/m}}.$$

Since

$$y > (3h(\alpha)^{d})^{4/\chi},$$

we have

$$(3h(\alpha)^{d})^{-2} y^{\chi/2} > 1,$$

so that

$$y^{\mu} > h(\beta)^{c_m d^{1/m}(1+(\chi/2))}.$$

We therefore obtain

$$|\alpha - \beta| < h(\beta)^{-c_m d^{1/m}(1+(\chi/2))}.$$

Apply Theorem 6A with $\chi/2$ in place of $\chi$. Then we have either $h(\beta) < B_1$, or $h(\beta)$ lies in the union of at most $m-1$ windows of exponential width $C = 6dm\lambda$ where $\lambda = 2d(12/\chi)^m$. We also know that the first case is ruled out because $h(\beta) \ge y \ge B_1$. Therefore, by Lemma 8C, the number of solutions in each such window is not greater than

$$1 + \frac{\log 2C}{\log(\mu - 1)}.$$

We know that $2C = 24d^2(12/\chi)^m m$, so that

$$\log 2C \ll 1 + \log d.$$

Also,

$$\log(\mu - 1) \gg \log \mu \gg \log d.$$

So the number of solutions in such windows is $\ll 1$, and the main statement of the theorem is proven.

We have already seen that the number of approximations with $y \ge B_1$ is $\ll 1$. It remains to count the solutions with $y$ in the interval

$$h(\alpha) \le y < B_1.$$

Recall that

$$B_1 = (8h(\alpha))^{12\lambda/\chi}$$

and

$$\lambda = 2d(12/\chi)^m.$$

Then

$$B_1 = (8h(\alpha))^{2d(12/\chi)^{m+1}}.$$

If $h(\alpha) \le 4^{10}$, then the second assertion of the theorem follows from the first. If $h(\alpha) > 4^{10}$ then $h(\alpha) > (8h(\alpha))^{1/2}$. Our interval is a window of exponential width not greater than

$$\frac{\log B_1}{\log (8h(\alpha))^{1/2}} = 4d(12/\chi)^{m+1}.$$

We also have $\delta = \mu - 2 > 2.1 - 2 = 1/10$, so that $4^{1/\delta} < 4^{10}$. Then by Lemma 8C, the number of approximations is not greater than

$$1 + \frac{\log\big(8d(12/\chi)^{m+1}\big)}{\log(\mu - 1)},$$

which is $\ll 1$.


THEOREM 9B. Suppose a is algebraic of degree d 2 3, and 0 <6 <1. Then 
the number of 6-approximations to a is less than 


* log h 
fog tea a) + c(d, ô), (9.2) 


where 3 
8 
e(d, 6) = “Flog 2d)? log (2) log 2a) 


This theorem estimates the number of “exceptional” approximations in Roth’s Theorem. Davenport and Roth (1956) had given an estimate with a summand $\exp(70d^{2}\delta^{-2})$. The latest results are by Bombieri and Van der Poorten (1988) and by Luckhardt (1989). Both use the Theorem of Esnault and Viehweg.

In Theorem 9B, the first term in the estimate is best possible (see Theorem 9C 
below), but the c(d, 6) term can probably be improved. Actually Bombieri and Van der 
Poorten had 3000 in place of 10°, but they also had 


| x 
a — — 
y 


rather than 6-approximations, and had 6/2 in place of L = log(1 + 4) > 6/2 in (9.2). 


1 
= 64y2+6 


Proof. Put m = [(50/6)? log 2d] and à = 2m™. Then à > d. Consider “small 
solutions” to be those with 
y Í (4h(a))P/> = Bo. 


By Theorem 8D, the number of such approximations is not greater than 


log log B2 1 
ear eed + 20 (G + 1) ; 


Estimating the first term, we have 


A 
ide Bis se log(4h(a)), 
so that
loglog Bz Í log ras log A + log log 4h(a). 


We know that 


2 2 
logA <14+mlogm £14 (2) 1o24) log ((2) 1o24) : 


log log 4h(w) Í 1+ logt log h(a). 


and also 


So the total number of “small solutions” is not greater than 
logt log h(a) | 2((50/5)? log 2d) log((50/5)? log 2d) 
2 yy See eee 
L 6/2 
Now consider “large solutions” to be those with y > B2. As in the previous proof, 
if 8 = z/y, then h(B) < 3h(a)*y. Consider 
y? ti = y2+(6/2) 45/2 
> MP) FC (3hla)) ty" 
> h(p)?* 6/2), 


The last inequality follows because y > Bz and À > d. So 


7 x x 1 
y?tê 
yields 
1 
la— Bl < re: 


Apply Theorem 6B with 6/2 in place of 6. Then either h(8) < B2 (which in our case 
is ruled out) or h() lies in the union of at most m — 1 windows of exponential width 
C = 6dmi. Notice that Bz 2 47/°, so we can apply Lemma 8C with 6/2 in place of ô. 
This tells us that the number of approximations with h(#) in the given window is not 


larger than 
log 2C 


4 
— PERT S =log2 1 
log(l + (6/2) 11 = 5 0820 + 
< 2 log 2C. 
We will estimate log 2C. We know that 2C = 24dm™*?, so that 


log 2C Í 2mlogm + log 24d. 
Then the number of approximations in a given window is


WA 


1 
H mlogm + 2 log 24d 


ge l 
g milogm. 


The number of windows is less than m, so the total number of “large” approximations 


is less than 
4 2 
{ 
Hm? logm Í 4 (=) (log 2d)? log (2) log 2a) $ 
Combining the results for “small” and “large solutions” gives the desired bound. 


THEOREM 9C. Let $K$ be a real algebraic number field of degree $d \ge 2$. Let
$\delta > 0$ be given. Then there are infinitely many $\alpha \in K$, with $K = \mathbb{Q}(\alpha)$ and $h(\alpha) \ge e$,
such that the number of $\delta$-approximations to $\alpha$ is greater than

$\frac{\log \log h(\alpha)}{L} - c_0(K, \delta)$,

with $L = \log(1 + \delta)$.
Remark. We could drop the condition that $h(\alpha) \ge e$ if we replace $\log \log h(\alpha)$
with $\log^+ \log h(\alpha)$.
This result is due to Mueller and Schmidt (1989).


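Proof. We may choose $\gamma \in K$ with $\mathbb{Q}(\gamma) = K$ and $|\gamma| < 1/2$. We also construct
the sequence $\{p_n/q_n\}$ as in Theorem 8E. Given $N \ge 1$, let $b_N$ be the least integer such
that $b_N \ge q_N^{2+\delta}$. Then we may pick an integer $a_N$ with

$\left| \frac{a_N}{b_N} - \frac{p_N}{q_N} \right| \le \frac{1}{2 b_N} \le \frac{1}{2 q_N^{2+\delta}}$.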
Set 
an 
an =Y- 7. 
N bN 


Then $\alpha_N$ generates our number field.
Suppose $n$ satisfies $1 \le n \le N$. Then we have


anea Ee s lage ae EN ee 
qn qN IN Qn 
1 1 
Zan GnQn+1 
1 1 
< 
= — O15 + ATTER 
qn? gn 


1 
6 2 
ant 
where the last inequality follows from the construction of $\{p_n/q_n\}$ in §8. (Recall, we had
$q_{n+1} \ge 2 q_n^{1+\delta}$.) So for $1 \le n \le N$, we have that $p_n/q_n$ is a $\delta$-approximation to $\alpha_N$.
Hence, $\alpha_N$ has at least $N$ of these $\delta$-approximations.
Now we seek a lower bound for $N$ in terms of $h(\alpha_N)$. We have (see Exercise 7C in
Chapter I)
h(an) =A ( = z) 
$ V2h(y)h (#4) 
Furthermore, 
aNj| < PN 1 < 
ZAX PEN OED, 
bn | hl H 2° 


so that its height satisfies

$h\left( \frac{a_N}{b_N} \right) \le \sqrt{a_N^2 + b_N^2} \le \sqrt{5}\, b_N$.

We then have

$h(\alpha_N) \le \sqrt{10}\, h(\gamma) b_N \le 2\sqrt{10}\, h(\gamma) q_N^{2+\delta}$,

since $b_N$ was the least integer with $b_N \ge q_N^{2+\delta}$. Taking logarithms gives

$\log h(\alpha_N) \le (2 + \delta) \log q_N + c'(K)$,

where the constant depends only on $K$ at this point. By our construction in §8, we
know that

< +87 
6 


logqn = log 3. 


Therefore,

$\log h(\alpha_N) \le c''(K,\delta)(1 + \delta)^N$

and

$N \ge \frac{\log \log h(\alpha_N)}{L} - c_0(K, \delta)$.

§10. A Generalization of Roth’s Theorem. 


Let $\mathbb{Q} \subset k \subset K$ be algebraic number fields. As in §1.6, let $M(K)$ be an indexing
set for absolute values of $K$. We write

$M(K) = M_0(K) \cup M_\infty(K)$,

where $M_0(K)$ consists of the non-Archimedean, and $M_\infty(K)$ of the Archimedean absolute
values. In most parts of these Notes, $S$ will denote a finite set of the type

$M_\infty(K) \subset S \subset M(K)$.

However, in the present section we need only that $S$ is a finite subset of
$M(K)$. Suppose that for each $v \in S$ we are given a linear form $L_v = L_v(X,Y)$ with
coefficients in $K$. We will study the inequality

$\prod_{v \in S} \left( \frac{|L_v(\mathbf{x})|_v}{|\mathbf{x}|_v} \right)^{n_v} < H_k(\mathbf{x})^{-2-\varepsilon}$ (10.1)

in unknowns $\mathbf{x} = (x, y)$ with components in $k$, where $|\mathbf{x}|_v = \max(|x|_v, |y|_v)$ and where
$n_v$ is the local degree. Since (10.1) is unaffected if we replace $\mathbf{x}$ by a multiple, we may
regard $\mathbf{x}$ as a point in projective space $\mathbb{P}^1(k)$.


THEOREM 10A. Given $\varepsilon > 0$, (10.1) has only finitely many solutions $\mathbf{x} \in \mathbb{P}^1(k)$.

This is due to Lang (1962). Earlier generalisations of Roth’s Theorem were given
by Ridout (1958). Now why does this actually give Roth’s Theorem?

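Suppose $\alpha$ is algebraic, and suppose

$\left| \alpha - \frac{x}{y} \right| < \frac{1}{|y|^{2+\varepsilon}}$. (10.2)

Then

$|\alpha y - x| < C_1 |\mathbf{x}|^{-1-\varepsilon}$

with a constant $C_1$. Further if $\alpha = \alpha^{(1)}, \ldots, \alpha^{(d)}$ are the conjugates of $\alpha$ (in $\mathbb{C}$), then

$|\alpha^{(1)} y - x| \cdots |\alpha^{(d)} y - x| \le C_2 |\mathbf{x}|^{d-2-\varepsilon}$.

Set $L_v(X,Y) = \alpha Y - X$ for each $v$. Then if $k = \mathbb{Q}$ and $K = \mathbb{Q}(\alpha)$, we have

$\prod_{v \in M_\infty(K)} |L_v(\mathbf{x})|_v^{n_v} = \prod_{i=1}^{d} |\alpha^{(i)} y - x| \le C_2 |\mathbf{x}|^{d-2-\varepsilon} = C_2 H_{\mathbb{Q}}(\mathbf{x})^{d-2-\varepsilon}$,

or

$\prod_{v \in M_\infty(K)} \left( \frac{|L_v(\mathbf{x})|_v}{|\mathbf{x}|_v} \right)^{n_v} \le C_2 H_{\mathbb{Q}}(\mathbf{x})^{-2-\varepsilon}$.

By Theorem 10A (applied with a slightly smaller $\varepsilon$, which absorbs the constant $C_2$) this
has only a finite number of solutions $\mathbf{x} \in \mathbb{P}^1(\mathbb{Q})$, so that (10.2)
has only finitely many solutions $x/y$.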


A more quantitative version is as follows. 


THEOREM 10B. Suppose $K$ is of degree $\delta$. Suppose there are not more than $t$
distinct forms $L_v$ for $v \in S$. Define $|L_v|_v$ and $H_K(L_v)$ in terms of the coefficient vectors
of $L_v$, and suppose that $H_K(L_v) \le H$ for $v \in S$. Then for given $C > 0$, the number of
solutions of

$\prod_{v \in S} \left( \frac{|L_v(\mathbf{x})|_v}{|L_v|_v |\mathbf{x}|_v} \right)^{n_v} < C H_k(\mathbf{x})^{-2-\varepsilon}$

in $\mathbf{x} \in \mathbb{P}^1(k)$ with
Hy(x) > c1(6,t,e)(C + H + 1324) 


is less than 
c3(6,t, €)ca(e)? 


where s = Card S. 

As Evertse, Györy, Stewart and Tijdeman (1988) point out, this theorem can be 
proved by making Lang’s arguments more explicit, and combining them with ideas of 
Davenport and Roth (1955). But no explicit proof of Theorem 10B has been published. 

The following exercises are not on the material of this particular section but could 
have been given earlier. 


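Exercise 10a. Let $B$ be a symmetric convex body in $\mathbb{R}^n$ and $\Lambda$ a lattice. The
inhomogeneous minimum of $B$ with respect to $\Lambda$ is defined as the least $\mu$ such that
$\Lambda + \mu B$ covers $\mathbb{R}^n$ (i.e., every $\mathbf{x} \in \mathbb{R}^n$ may be written as $\boldsymbol{\ell} + \mathbf{y}$ with $\boldsymbol{\ell} \in \Lambda$ and $\mathbf{y} \in \mu B$).
Prove that $\mu$ is well-defined, and that $0 < \mu \le n\lambda_n/2$, where $\lambda_n$ is the $n$th minimum.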

For n = 2 and B a disk centered at the origin we have the following picture. 


[Figure: disks centered at the lattice points, covering the plane.]


Exercise 10b. Let $\alpha \in \mathbb{R}$ be irrational. We call $(x, y) \in \mathbb{Z}^2$ a best approximation to
$\alpha$ if $y$ is positive, $|\alpha y - x| < 1/2$, and if for any other pair $(x', y')$ with $1 \le y' \le y$, we have
$|\alpha y' - x'| > |\alpha y - x|$. Show that one gets an infinite sequence of best approximations,
say $(x_1, y_1), (x_2, y_2), \ldots$, with $y_1 < y_2 < \cdots$ and

$|\alpha y_i - x_i| \le \frac{1}{y_{i+1}}$ $(i = 1, 2, \ldots)$.
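As a numerical illustration of the definition in Exercise 10b (not a solution of the exercise), the following sketch lists the best approximations to $\alpha = \sqrt{2}$ by brute force and lets one check the inequality $|\alpha y_i - x_i| \le 1/y_{i+1}$ on the first few terms. The choice $\alpha = \sqrt{2}$, the function name `best_approximations` and the cutoff `y_max` are assumptions made only for this example.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # ample precision for the small denominators used here

ALPHA = Decimal(2).sqrt()  # the sample irrational number alpha = sqrt(2)


def best_approximations(y_max):
    """Return all best approximations (x, y) to ALPHA with 1 <= y <= y_max."""
    found = []
    record = Decimal(1)  # smallest |alpha*y' - x'| seen for smaller denominators
    for y in range(1, y_max + 1):
        x = int((ALPHA * y).to_integral_value())  # nearest integer to alpha*y
        dist = abs(ALPHA * y - x)
        if dist < record:  # strictly better than every smaller denominator
            found.append((x, y))
            record = dist
    return found


if __name__ == "__main__":
    approx = best_approximations(100)
    for (x, y), (_, y_next) in zip(approx, approx[1:]):
        print((x, y), abs(ALPHA * y - x) <= Decimal(1) / y_next)
```

For $\sqrt{2}$ the pairs found are the convergents of the continued fraction, which is the classical picture behind the exercise.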
Exercise 10c. Let $\alpha \in \mathbb{R}$ be irrational, and for $N \ge 1$, let $\Pi(N)$ be the parallelogram

$|\alpha y - x| \le \frac{1}{N}, \quad |y| \le N$. (10.3)

Since the area of $\Pi(N)$ is 4, Minkowski’s Convex Body Theorem says that the first
minimum satisfies $\lambda_1 = \lambda_1(N) \le 1$. Show that there are arbitrarily large values of $N$
with $\lambda_2(N) \le 1$. (Hint: This should follow from Exercise 10b.)


Exercise 10d. Combine Exercises 10a and 10c to show that if $\alpha, \beta \in \mathbb{R}$, where $\alpha$
is irrational and $\beta$ is not of the type $\beta = m\alpha + n$ with $m, n \in \mathbb{Z}$, then there are infinitely
many $(x,y) \in \mathbb{Z}^2$ with $y > 0$ and

$|\alpha y - \beta - x| < 1/y$.

Remarks. Exercise 10d is a quantitative version of the one-dimensional “Kronecker’s
Theorem”, which only asserts that we can solve

$|\alpha y - \beta - x| < \varepsilon$

for $\alpha$ irrational. Minkowski proved that (10.3) may be replaced by the stronger inequality

$|\alpha y - \beta - x| < 1/(4y)$,

which is best possible. (See Cassels (1957), Chapter III, Theorem II A).


III. The Thue Equation. 
References: Thue (1909), A. Baker (1968), Bombieri and Schmidt (1987). 
§1. Main Result. Let $F(X,Y) = a_0 X^d + a_1 X^{d-1} Y + \cdots + a_d Y^d$ with $a_i \in \mathbb{Z}$ be
a form of degree $d \ge 3$ which is irreducible over $\mathbb{Q}$.
Remark. Such a form $F$ can never be irreducible over $\mathbb{C}$. First consider

$F(X,1) = a_0 X^d + a_1 X^{d-1} + \cdots + a_d = a_0 (X - \alpha_1) \cdots (X - \alpha_d)$

with $\alpha_1, \ldots, \alpha_d$ algebraic of degree $d$ and conjugates of one another. Then

$F(X,Y) = Y^d F(X/Y, 1) = a_0 (X - \alpha_1 Y) \cdots (X - \alpha_d Y)$.

Theorem 1A. (Thue, 1909). Let $F$ be as above and let $m$ be given. The equation
$F(x,y) = m$ (1.1)
has only finitely many integer solutions $(x,y)$.


Remark. Today, equations of type (1.1) are called Thue equations. 
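Remark. Theorem 1A is false for $d = 2$. Consider, for example,

$x^2 - 2y^2 = 1$.

This equation factors into

$(x + \sqrt{2}\, y)(x - \sqrt{2}\, y) = 1$.

If $\mathcal{O} = \mathbb{Z}[\sqrt{2}]$, i.e., the ring of elements $x + \sqrt{2}\, y$ with $x, y \in \mathbb{Z}$, then $\varepsilon = x + \sqrt{2}\, y$
and $\varepsilon' = x - \sqrt{2}\, y$ are units in $\mathcal{O}$. In particular, we can take $x = 3$ and $y = 2$. Then
$\varepsilon_0 = 3 + 2\sqrt{2}$ is a unit. For each $n \ge 1$, the number $\varepsilon_0^n$ is also a unit, which gives a
solution $\varepsilon_0^n = x_n + \sqrt{2}\, y_n$ to $x^2 - 2y^2 = 1$. For example $\varepsilon_0^2 = 17 + 12\sqrt{2}$, so that $x_2 = 17$,
$y_2 = 12$.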
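The infinitude of solutions in the case $d = 2$ can be made concrete. The sketch below multiplies repeatedly by the unit $3 + 2\sqrt{2}$, using $(x + \sqrt{2}\, y)(3 + 2\sqrt{2}) = (3x + 4y) + \sqrt{2}\,(2x + 3y)$, and so produces the pairs $(x_n, y_n)$ described in the remark. The function name `pell_solutions` is a label chosen for this illustration only.

```python
def pell_solutions(count):
    """Return the first `count` solutions (x_n, y_n) of x^2 - 2*y^2 = 1
    coming from the powers of the unit 3 + 2*sqrt(2)."""
    solutions = []
    x, y = 3, 2  # the unit 3 + 2*sqrt(2) itself (n = 1)
    for _ in range(count):
        solutions.append((x, y))
        # multiply x + y*sqrt(2) by 3 + 2*sqrt(2)
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return solutions


if __name__ == "__main__":
    for x, y in pell_solutions(5):
        print(x, y, x * x - 2 * y * y)
```

Every pair printed satisfies $x^2 - 2y^2 = 1$, and the coordinates grow without bound, so no finiteness statement of the type of Theorem 1A can hold for this quadratic form.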
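Proof. Factoring $F(x,y)$ over $\mathbb{C}$, we can write

$a_0 (x - \alpha_1 y) \cdots (x - \alpha_d y) = m$. (1.2)

Then dividing by $y^d$ and taking absolute values gives

$|a_0| \left| \alpha_1 - \frac{x}{y} \right| \cdots \left| \alpha_d - \frac{x}{y} \right| = \frac{|m|}{|y|^d}$. (1.3)

We have, without loss of generality,

$|x - \alpha_1 y| = \min_{1 \le i \le d} |x - \alpha_i y|$,

which is the same as

$\left| \alpha_1 - \frac{x}{y} \right| = \min_{1 \le i \le d} \left| \alpha_i - \frac{x}{y} \right|$.

Also, let

$\gamma = \frac{1}{2} \min_{i \ne j} |\alpha_i - \alpha_j| > 0$.

If $y$ is large, then both sides of (1.3) will be small. In particular, $|\alpha_1 - x/y|$ will be small.
For $i \ne 1$, observe that

$\left| \alpha_i - \frac{x}{y} \right| \ge |\alpha_i - \alpha_1| - \left| \alpha_1 - \frac{x}{y} \right| \ge 2\gamma - \gamma = \gamma$.

Then we have

$\left| \alpha_1 - \frac{x}{y} \right| \le \left| \frac{m}{a_0 \gamma^{d-1} y^d} \right| = \frac{c}{|y|^d}$. (1.4)

Since $d \ge 3$, Roth’s Theorem implies that there is only a finite number of solutions
$(x,y)$.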


Exercise 1a. The proof of Liouville’s Theorem 1E of Chapter II uses implicitly that
$|F(x,y)| \ge 1$ for integers $x, y$. (Actually, it uses that $|P(x/y)| \ge 1/y^d$ for a polynomial
$P \in \mathbb{Z}[X]$ of degree $d$ which does not vanish at $x/y$.) Employ Roth’s Theorem to show
that for $F$ as in Theorem 1A,

$|F(x,y)| \ge c_0(F,\varepsilon) (\max(|x|, |y|))^{d-2-\varepsilon} > 0$.

Exercise 1b. In Theorem 1A, one may weaken the hypothesis on $F$. Rather
than supposing that $F$ is irreducible over $\mathbb{Q}$, assume that at least three of the complex
numbers $\alpha_1, \ldots, \alpha_d$ in (1.2) are distinct.

The methods of Thue, Siegel, and Roth are “ineffective” in the sense that they
don’t yield a bound $A = A(F,m)$ such that any solution $(x, y)$ satisfies

$\max(|x|, |y|) \le A$.


Alan Baker (1967) remedied this situation. 
Thue’s method, however, can be used to give some upper bound on the number of 
possible solutions. Lewis and Mahler (1961) gave a bound 


B = B(d,m, H), 
where $H = \max_i |a_i|$. For many years, it had been conjectured that a better bound
could be obtained. Siegel (1929) conjectured that there should be some $B = B(d, m)$
independent of $H$. In subsequent work, he proved this for some cases. Evertse (1983)
proved the conjecture in his thesis, obtaining the bound

$7^{15\left(\binom{d}{3}+1\right)^2} + 6 \times 7^{2\binom{d}{3}(\nu+1)}$,

where $\nu$ is the number of distinct prime factors of $m$. More recently, Bombieri and
Schmidt gave the following result.


THEOREM 1B. (Bombieri and Schmidt, 1987). Let $F$ and $m$ be given. The
number of primitive solutions to the diophantine equation

$F(x,y) = m$

is not greater than

$c_0 d^{1+\nu}$,

where $c_0$ is an absolute constant, $\nu$ is the number of distinct prime factors of $m$, and $d$
is the degree of $F$.
Further advances on the number of solutions were made in recent work of Stewart 
(to appear). 


THEOREM 1C. Let $F$ and $m$ be given. The number of solutions of the “Thue
inequality”

$|F(x,y)| \le m$

is

$\ll d m^{2/d} (1 + \log m^{1/d})$.


In Theorem 1B, consider the case $m = 1$. Then we have $F(x,y) = 1$ and $\nu = 0$.

So the number of solutions is $\ll d$. This bound is not bad, as is seen by the following
example. Consider

$x^d + c(x - y)(2x - y) \cdots (dx - y) = 1$. (1.5)

This equation has at least $d$ solutions, namely $(1,1), (1,2), \ldots, (1,d)$. In fact, if $d$ is
even, we have $2d$ solutions. Hence we see that $\ll d$ is best possible. Note that when $c$
is e.g., a prime, then the form in (1.5) is irreducible over $\mathbb{Q}$ by Eisenstein’s criterion.
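The solutions of example (1.5) are easy to confirm by machine. The following sketch evaluates $F(x,y) = x^d + c(x - y)(2x - y) \cdots (dx - y)$ and searches a small box for points with $F(x,y) = 1$; the helper names `thue_form` and `small_solutions` and the sample values of $d$, $c$ and the box size are assumptions of this illustration. A box search can of course only exhibit solutions, it cannot prove that there are no others.

```python
from math import prod


def thue_form(x, y, d, c):
    """Evaluate F(x, y) = x^d + c*(x - y)(2x - y)...(dx - y)."""
    return x ** d + c * prod(i * x - y for i in range(1, d + 1))


def small_solutions(d, c, bound):
    """All integer points with |x|, |y| <= bound and F(x, y) = 1."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if thue_form(x, y, d, c) == 1]


if __name__ == "__main__":
    # d = 4 is even, so both (1, j) and (-1, -j) appear for j = 1, ..., 4
    print(small_solutions(4, 3, 5))
```

At $(1, j)$ with $1 \le j \le d$ the factor $jx - y$ vanishes, which is why the value is exactly $1$; for even $d$ the form is unchanged under $(x,y) \mapsto (-x,-y)$, giving the second family.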

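Now let $m$ be arbitrary, say $m = p_1^{a_1} \cdots p_\nu^{a_\nu}$. Then $p_1 \cdots p_\nu \le m$ and
$\log p_1 + \log p_2 + \cdots + \log p_\nu \le \log m$. If we have $p_1 < p_2 < \cdots < p_\nu$, then $p_i \ge i+1$ $(i = 1, \ldots, \nu)$. Thus
$\log 2 + \log 3 + \cdots + \log(\nu + 1) \le \log m$, and for $m$ sufficiently large,

$\nu \log \nu \le (1 + \varepsilon) \log m$.

Then, again for $\varepsilon > 0$ and $m$ sufficiently large,

$\nu \le \frac{\log m}{\log \log m} (1 + \varepsilon)$.

So the number of solutions of $F(x, y) = m$ is

$\ll d^{1+\nu} \ll_{d,\varepsilon} m^{\varepsilon}$

as $m \to \infty$.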
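The elementary step in this estimate, namely $\log 2 + \log 3 + \cdots + \log(\nu + 1) \le \log m$, is the same as $(\nu + 1)! \le m$ and can be tested directly. The sketch below counts distinct prime factors by trial division and checks that inequality in exact integer arithmetic; the function names are chosen for this example and are not notation from the notes.

```python
from math import factorial


def distinct_prime_factors(m):
    """Return nu(m), the number of distinct primes dividing m >= 1."""
    count, p = 0, 2
    while p * p <= m:
        if m % p == 0:
            count += 1
            while m % p == 0:
                m //= p
        p += 1
    # whatever is left above 1 is a single remaining prime factor
    return count + (1 if m > 1 else 0)


def factorial_bound_holds(m):
    """Check (nu + 1)! <= m, i.e. log 2 + ... + log(nu + 1) <= log m."""
    return factorial(distinct_prime_factors(m) + 1) <= m


if __name__ == "__main__":
    m = 2 * 3 * 5 * 7 * 11 * 13
    print(m, distinct_prime_factors(m), factorial_bound_holds(m))
```

The bound is closest to sharp for products of the first few primes, which is why $\nu$ can be as large as roughly $\log m / \log \log m$ but not larger.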


Conjecture. Given a form $F$ as above, the number of solutions of $F(x, y) = m$ is

$\ll (\log m)^c$,

where $c$ is an absolute constant. Perhaps this is valid for $c = 1$, or at least for every
$c > 1$.
Now we will look at Theorem 1C. Consider the set $S(m)$ of $(x,y) \in \mathbb{R}^2$ with

$|F(x,y)| \le m$.

Then $S(m) = m^{1/d} S(1)$. Suppose $F(x,y) = a_0 (x - \alpha_1 y) \cdots (x - \alpha_d y)$, and say, for
simplicity, that the $\alpha_i$ are real. Then $S(1)$ contains the lines $x = \alpha_i y$ as in the following
illustration.

[Figure: the region $S(1)$, containing the lines $x = \alpha_1 y$, $x = \alpha_2 y$, $x = \alpha_3 y$.]

In Theorem 1C, we are counting the integer points in $S(m)$. Intuitively, the number
should be approximately the area of $S(m)$. So if $A_F = \mathrm{area}(S(1))$, the number of
solutions should be about $m^{2/d} A_F$.


Remark. Mahler (1934) has shown that the number of solutions of the Thue
inequality is

$\sim A_F m^{2/d}$

as $m \to \infty$. But the dependency of the error term on the coefficients of $F$ was left
unspecified.


§2. Preliminaries. We may deal with a more general type of forms than the
forms in 2 variables,

$F(X,Y) = a_0 (X - \alpha^{(1)} Y) \cdots (X - \alpha^{(d)} Y)$,

of §1. Let $L(\mathbf{X})$ be a linear form $\alpha_1 X_1 + \cdots + \alpha_n X_n$ with coefficients in a number
field $K$ of degree $d$. We will suppose that $\alpha_1, \ldots, \alpha_n$ are linearly independent over
$\mathbb{Q}$. If $\alpha \mapsto \alpha^{(i)}$ $(i = 1, \ldots, d)$ are the embeddings of $K$ into $\mathbb{C}$, then let
$L^{(i)}(\mathbf{X}) = \alpha_1^{(i)} X_1 + \cdots + \alpha_n^{(i)} X_n$ $(i = 1, \ldots, d)$. A norm form is a form

$F(\mathbf{X}) = F(X_1, \ldots, X_n) = a L^{(1)}(\mathbf{X}) \cdots L^{(d)}(\mathbf{X})$

where $a \in \mathbb{Q}^\times$.
If $K \subset \mathbb{C}$, put

$|L| = \sqrt{|\alpha_1|^2 + \cdots + |\alpha_n|^2}$.

The norm form $F(\mathbf{X})$ has rational coefficients, so we may write $F(\mathbf{X}) = a' G(\mathbf{X})$ where
$a' \in \mathbb{Q}$ and the coefficients of $G$ are relatively prime integers. Then we have
$\mathrm{cont}(F) = |a'|$. We define $H_K(L) = H_K(\alpha_1, \ldots, \alpha_n)$.


Exercise 2a. Let a be a root of at — 2a — 2 = 0 and K = Q(a). Compute the 
norm form 
Nix +ay+a?z)=a2t+---, 


LEMMA 2A. Given a norm form $F(\mathbf{X})$ (as above), we have

$|a| |L^{(1)}| \cdots |L^{(d)}| = (\mathrm{cont}\, F) H_K(L)$.

Proof. Write $M(K) = M_\infty(K) \cup M_0(K)$, where $M_\infty(K)$ consists of Archimedean
absolute values (i.e., $v \mid \infty$), and $M_0(K)$ of non-Archimedean absolute values. We have

$H_K(L) = \prod_{v \in M(K)} |L|_v^{n_v}$,

where $|L|_v = |(\alpha_1, \ldots, \alpha_n)|_v$. Therefore

$H_K(L) = H_{K\infty}(L) H_{K0}(L)$ (2.1)

where

$H_{Kj}(L) = \prod_{v \in M_j(K)} |L|_v^{n_v}$ $(j = 0, \infty)$.

By (vi) of Chapter I, §6,

$H_{K\infty}(L) = |L^{(1)}| \cdots |L^{(d)}|$. (2.2)

By (vii) of Chapter I, §6, we see that if $\alpha \mapsto \alpha^{(i)}$ $(i = 1, \ldots, d)$ are the isomorphic
embeddings of $K$ into $\mathbb{C}$, then

$H_{K0}(L) = H_{K^{(i)}0}(L^{(i)}) = \prod_{v \in M_0(K^{(i)})} |L^{(i)}|_v^{n_v}$ $(i = 1, \ldots, d)$.

Therefore if $E$ is the compositum of $K^{(1)}, \ldots, K^{(d)}$, then

$H_{K0}(L)^e = \prod_{w \in M_0(E)} |L^{(i)}|_w^{n_w}$ $(i = 1, \ldots, d)$,

where $e = [E : K^{(i)}]$ $(i = 1, \ldots, d)$. Thus

$H_{K0}(L)^{de} = \prod_{w \in M_0(E)} \left( |L^{(1)}|_w \cdots |L^{(d)}|_w \right)^{n_w} = \prod_{w \in M_0(E)} |a^{-1} F|_w^{n_w}$,

where in the last relation we used $F = a L^{(1)} \cdots L^{(d)}$ and Gauss’ Lemma, and the
notation that for any polynomial $P$, we write $|P|_v$ for the maximum of $|c|_v$ for the
coefficients $c$ of $P$. Since $a^{-1} F$ has rational coefficients,

$H_{K0}(L)^{de} = \left( \prod_{p \in M_0(\mathbb{Q})} |a^{-1} F|_p \right)^{de}$.

Therefore

$H_{K0}(L) = |a| \cdot (\mathrm{cont}\, F)^{-1}$.

This, in conjunction with (2.1), (2.2), gives the assertion.

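Recall, $\alpha_1, \ldots, \alpha_n$ are linearly independent over $\mathbb{Q}$. So we can extend them to a
field basis $\alpha_1, \ldots, \alpha_n, \ldots, \alpha_d$ of $K$ over $\mathbb{Q}$. Then it is known that the matrix

$\begin{pmatrix} \alpha_1^{(1)} & \cdots & \alpha_d^{(1)} \\ \vdots & & \vdots \\ \alpha_1^{(d)} & \cdots & \alpha_d^{(d)} \end{pmatrix}$

is non-singular. Hence the submatrix

$\begin{pmatrix} \alpha_1^{(1)} & \cdots & \alpha_n^{(1)} \\ \vdots & & \vdots \\ \alpha_1^{(d)} & \cdots & \alpha_n^{(d)} \end{pmatrix}$

has rank $n$.

Among the linear forms $L^{(1)}, \ldots, L^{(d)}$, there is a set of $n$ linearly independent
ones. Let $I$ be the collection of $n$-tuples $(i_1, \ldots, i_n)$ with $1 \le i_j \le d$ for which the forms
$L^{(i_1)}, \ldots, L^{(i_n)}$ are linearly independent. The semi-discriminant of $F$ is given by

$D(F) = |a|^{|I|n/d} \prod_{(i_1, \ldots, i_n) \in I} |\det(L^{(i_1)}, \ldots, L^{(i_n)})|$,

where $|I| = \mathrm{card}(I)$. It is easy to see that $D(F)$ is independent of how we write $F$. Say

$F = a L^{(1)} \cdots L^{(d)} = b M^{(1)} \cdots M^{(d)}$.

By the essential uniqueness of factorization, we have after reordering that $M^{(i)} = \lambda_i L^{(i)}$
with certain coefficients $\lambda_1, \ldots, \lambda_d$. Since $I = I(L^{(1)}, \ldots, L^{(d)}) = I(M^{(1)}, \ldots, M^{(d)})$,
it will suffice (for the independence) that each $L^{(i)}$ occurs exactly $|I|n/d$ times in a
determinant of the product. All of the $L^{(i)}$ together occur $|I|n$ times, so if it is “fair”,
each $L^{(i)}$ will occur $|I|n/d$ times. Since there is an automorphism of $E$ mapping $L^{(i)}$
into $L^{(1)}$ $(1 \le i \le d)$, each $L^{(i)}$ does indeed occur the same number of times, i.e., $|I|n/d$
times.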
Clearly $D(F)$ is nonzero. Because the product is fixed under $\sigma \in \mathrm{Gal}(E/\mathbb{Q})$
(where $E$ is the compositum of $K^{(1)}, \ldots, K^{(d)}$), and since by what we have just said
the exponent $|I|n/d$ is an integer, we may conclude that $D(F)$ is rational.

The coefficients of the $L^{(i)}$ lie in the field $E$. If $v \in M_0(E)$, then by Gauss’ Lemma,

$|a|_v |L^{(1)}|_v \cdots |L^{(d)}|_v = |F|_v$.

If we suppose that $F$ has integer coefficients, which we shall, then

$|a|_v |L^{(1)}|_v \cdots |L^{(d)}|_v = |F|_v \le 1$.

On the other hand,

$|\det(L^{(i_1)}, \ldots, L^{(i_n)})|_v \le |L^{(i_1)}|_v \cdots |L^{(i_n)}|_v$.

We have noted that each conjugate $L^{(i)}$ occurs exactly $|I|n/d$ times in the product for
$D$. Then for $v \in M_0(E)$, we have

$|D|_v \le \left( |a|_v |L^{(1)}|_v \cdots |L^{(d)}|_v \right)^{|I|n/d} \le 1$.

Since this is true for every non-Archimedean $|\cdot|_v$, we have $D \in \mathbb{Z}$ and $|D| \ge 1$, where $|\cdot|$
is the usual absolute value.
If $v$ is Archimedean, then

$|D|_v \le \left( |a|_v |L^{(1)}|_v \cdots |L^{(d)}|_v \right)^{|I|n/d}$

by Hadamard’s inequality. So we have

$|D| \le \left( |a| |L^{(1)}| \cdots |L^{(d)}| \right)^{|I|n/d}$

for the ordinary absolute value also. We may summarize our results as follows.

LEMMA 2B. Suppose $F$ is a norm form with coefficients in $\mathbb{Z}$. Then $D$ is a
rational integer, $D \ne 0$, and

$D(F) \le H^*(F)^{|I|n/d}$,

where $H^*(F) = (\mathrm{cont}\, F) H_K(L) = |a| |L^{(1)}| \cdots |L^{(d)}|$.

Remark. Suppose $n = 2$ and $F = a(X - \alpha^{(1)} Y) \cdots (X - \alpha^{(d)} Y)$ is irreducible over
$\mathbb{Q}$. Then $\alpha^{(i)} \ne \alpha^{(j)}$ for $i \ne j$, and $I$ consists of all pairs $(i,j)$ with $i \ne j$ and $1 \le i$,
$j \le d$, so that $|I| = d(d-1)$. We may conclude that

$D(F) \le H^*(F)^{2(d-1)}$.

Now suppose $T : \mathbb{Z}^n \to \mathbb{Z}^n$ is a linear map. Put $F^T(\mathbf{X}) = F(T\mathbf{X})$. Then

$D(F^T) = |\det T|^{|I|} D(F)$. (2.3)

Note that the set $I$ introduced above depends on the ordering of $L^{(1)}, \ldots, L^{(d)}$.
But its cardinality, which by abuse of notation we denote by $|I(F)|$, depends on $F$ only.
For nonsingular $T$,

$|I(F^T)| = |I(F)|$.


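Let $p$ be a prime. Let $T_0, T_1, \ldots, T_p$ be the linear maps given by the matrices

$T_0 = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}, \quad T_j = \begin{pmatrix} 1 & 0 \\ j & p \end{pmatrix} \quad (j = 1, \ldots, p)$.

Suppose $\mathbf{x} = \binom{x}{y}$ lies in $\mathbb{Z}^2$. Then either $x \equiv 0 \pmod p$ or $x \not\equiv 0 \pmod p$. In the first
case, $x = p x_1$ for some $x_1 \in \mathbb{Z}$ and

$\binom{x}{y} = T_0 \binom{x_1}{y}$.

On the other hand, if $x \not\equiv 0 \pmod p$, then $y \equiv jx \pmod p$, or $y = jx + p y_1$, with
$1 \le j \le p$. In this case,

$\binom{x}{y} = T_j \binom{x}{y_1}$.

We have seen that any integer point $\mathbf{x}$ may be written as

$\mathbf{x} = T_j \mathbf{x}'$

for some $j$ $(0 \le j \le p)$ and $\mathbf{x}' \in \mathbb{Z}^2$. Further, if $\mathbf{x}$ is primitive, then so is $\mathbf{x}'$. More
generally, for $n$ variables, put

$T_0 = \begin{pmatrix} p & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}, \quad T_j = \begin{pmatrix} 1 & 0 & & & \\ j & p & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \quad (1 \le j \le p)$.

Again, every $\mathbf{x} \in \mathbb{Z}^n$ may be written as $\mathbf{x} = T_j \mathbf{x}'$ for some $j$ and some $\mathbf{x}' \in \mathbb{Z}^n$. In
symbols,

$\mathbb{Z}^n = \bigcup_{j=0}^{p} T_j \mathbb{Z}^n$.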
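The covering of $\mathbb{Z}^2$ by the $p + 1$ sublattices $T_j \mathbb{Z}^2$ can be spelled out as a small procedure. The sketch below takes $T_0 = \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix}$ and $T_j = \begin{pmatrix} 1 & 0 \\ j & p \end{pmatrix}$ for $1 \le j \le p$ (the reading of the matrices used here), finds for a given integer point an index $j$ and a preimage, and checks primitivity. The function names `decompose`, `apply_T` and `is_primitive` are hypothetical helpers for this illustration.

```python
from math import gcd


def decompose(x, y, p):
    """Write (x, y) = T_j (x1, y1) for a prime p; return (j, (x1, y1))."""
    if x % p == 0:
        return 0, (x // p, y)          # first case: x = p * x1
    # x is invertible mod p: solve y = j*x (mod p) with 1 <= j <= p
    j = (y * pow(x % p, -1, p)) % p
    if j == 0:
        j = p
    return j, (x, (y - j * x) // p)    # y = j*x + p*y1


def apply_T(j, point, p):
    """Apply T_0 = [[p, 0], [0, 1]] or T_j = [[1, 0], [j, p]] to a point."""
    x1, y1 = point
    if j == 0:
        return (p * x1, y1)
    return (x1, j * x1 + p * y1)


def is_primitive(point):
    """True if the coordinates of the point are coprime."""
    return gcd(point[0], point[1]) == 1


if __name__ == "__main__":
    print(decompose(7, 12, 5), decompose(10, 3, 5))
```

Since every $T_j$ has determinant $p$, this is the step that trades one equation $F(\mathbf{x}) = 1$ for $p + 1$ equations whose forms have larger semi-discriminant.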


Suppose that we want to study the number of solutions of the diophantine equation

$F(\mathbf{x}) = 1$.

This number is not bigger than the sum of the numbers of solutions of

$F^{T_j}(\mathbf{x}) = F(T_j \mathbf{x}) = 1$

for $j = 0, 1, \ldots, p$. Since $D(F^T) = |\det T|^{|I|} D(F)$, we know that $D(F^{T_j}) \ge p^{|I|}$. The
following lemma is now obvious.


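LEMMA 2C. Let $d$ be fixed and let $\mathfrak{C}$ be a class of norm forms which is closed
under nonsingular substitutions $T : \mathbb{Z}^n \to \mathbb{Z}^n$. Let $N_{\mathfrak{C}}$ be the maximum number of
solutions of $F(\mathbf{x}) = 1$ over all $F$ in $\mathfrak{C}$. Let $N_{\mathfrak{C}}(p)$ be the maximum number of solutions
of $F(\mathbf{x}) = 1$ over all $F$ in $\mathfrak{C}$ with $D(F) \ge p^{|I|}$. Then

$N_{\mathfrak{C}} \le (p+1) N_{\mathfrak{C}}(p)$.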


Remark 2D. The lemma can be modified in two ways. Rather than counting 
solutions of F(x) = 1, we may count solutions of F(x) € S, where S is a given set of 
integers. Secondly, the lemma remains true if instead of counting all integer points we 
count only primitive integer points. 

Now let us restrict to $n = 2$ and $F(X,Y)$ as in Thue’s Theorem, and of degree $d$.
Let $N_F(m)$ be the number of integer solutions of the Thue inequality $|F(x, y)| \le m$ and
$P_F(m)$ the number of primitive integer solutions of the same inequality. Let $P'_F(m)$ be
the number of primitive integer solutions of the inequality

$m/2^d < |F(x,y)| \le m$. (2.4)

Furthermore, let $N(m) = \max_F N_F(m)$, and likewise for $P(m)$ and $P'(m)$.

PROPOSITION 2E. Suppose $F$ is a form as in Thue’s Theorem and
$D(F) \ge (50 m^{1/d})^{2|I|}$. Then

$P'_F(m) \ll d(1 + \log m^{1/d})$,

where the implicit constant is absolute.

Proposition 2E will be proven in sections 3, 4, and 5. We will use it now to deduce
Theorem 1C.


Proof. (of Theorem 1C). Pick a prime

$p \ge 2500 m^{2/d}$. (2.5)

Then if $D(F) \ge p^{|I|}$, the condition of Proposition 2E is true. Combining Remark 2D
and the proposition, we get that

$P'(m) \ll (p + 1) d (1 + \log m^{1/d}) \ll d m^{2/d} (1 + \log m^{1/d})$,

provided $p$ is a prime chosen as small as possible with (2.5). Given $m$, pick $u$ satisfying

$2^{du} \le m < 2^{d(u+1)}$.

Then

$P(m) \le P(2^{d(u+1)}) \le \sum_{j=0}^{u+1} P'(2^{dj}) \ll d \sum_{j=0}^{u+1} 2^{2j} (1 + \log 2^j) \ll d\, 2^{2u} (1 + u) \ll d m^{2/d} (1 + \log m^{1/d})$.

Now we can count all the solutions of $|F(x,y)| \le m$. Given such a solution, say
$(x,y) = t(x',y')$ where $(x',y')$ is primitive and $t > 0$, we have

$|F(x',y')| \le m/t^d$,

since $F$ is homogeneous of degree $d$. Thus

$N(m) \le \sum_{1 \le t \le m^{1/d}} P\left( \frac{m}{t^d} \right) \ll d m^{2/d} (1 + \log m^{1/d})$.

Before giving a proof of Proposition 2E, we show that we may place additional
hypotheses on $F$ by introducing an equivalence relation on forms. If $F$ and $G$ are forms
of degree $d$, we say $F \sim G$ if $G = F^T$ for some $T \in SL(2,\mathbb{Z})$, where $SL(2,\mathbb{Z})$ is the group
of one-to-one, linear maps from $\mathbb{Z}^2$ onto itself. The number of solutions of the Thue
equation $F(x, y) = m$, or the inequality $|F(x,y)| \le m$, depends only on the equivalence
class of $F$. The same is also true for primitive solutions.

Recall that we let $P'_F(m)$ denote the number of primitive solutions of the inequality
(2.4). Suppose there is such a solution, say $\mathbf{x}_0 = (x_0, y_0)$. Since $\mathbf{x}_0$ is primitive, there
exist integers $x_1, y_1 \in \mathbb{Z}$ with

$\begin{vmatrix} x_0 & y_0 \\ x_1 & y_1 \end{vmatrix} = 1$.

Let $G(X,Y) = F(x_0 X + x_1 Y, y_0 X + y_1 Y)$. Then $G \sim F$ and $G(1,0) = F(x_0, y_0)$. This
gives us

$m/2^d < |G(1,0)| \le m$.

We may, therefore, restrict ourselves to forms $F$ which are normalized in the sense that
$m/2^d < |F(1,0)| \le m$. By writing the form as $F(X,Y) = a(X - \alpha^{(1)} Y) \cdots (X - \alpha^{(d)} Y)$,
this inequality becomes

$m/2^d < |a| \le m$.

We will call $F$ reduced if $F$ is normalized and if $H^*(F)$ is minimal among all
normalized forms which are equivalent to $F$. We have seen that Proposition 2E is
sufficient to get our bound on the number of solutions of the Thue inequality. Since $D(F)$
and $|I(F)|$ are invariant under substitutions from $SL(2,\mathbb{Z})$, we may restrict ourselves to
reduced forms $F$. In view of $D(F) \ge (50 m^{1/d})^{2|I|}$ and Lemma 2B, we need only concern
ourselves with reduced forms having

$H^*(F) \ge 50^d m$. (2.6)

We may also suppose, without loss of generality, that $\mathrm{cont}(F) = 1$. For in general,
if $F = c\hat{F}$ with $\mathrm{cont}(\hat{F}) = 1$, we replace $F, m$ respectively by $\hat{F} = c^{-1} F$, $\hat{m} = c^{-1} m$, so
that (2.4) becomes $\hat{m}/2^d < |\hat{F}(\mathbf{x})| \le \hat{m}$, and $\hat{F}$ is normalized (with respect to $\hat{m}$) and
reduced, and (2.6) changes into $H^*(\hat{F}) \ge 50^d \hat{m}$.

What we need to prove, then, is the following

PROPOSITION 2F. Let $F$ be a form as in Thue’s Theorem. Suppose $F$ is
reduced, $\mathrm{cont}(F) = 1$, and (2.6) holds. Then

$P'_F(m) \ll d(1 + \log m^{1/d})$. (2.7)


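§3. More on the connection between Thue’s Equation and Diophantine
Approximations. Throughout the next few sections, we will let $F(X,Y)$ be a form of
degree $d$ as in Thue’s Theorem. We will write $f(X) = F(X,1) = a(X - \alpha^{(1)}) \cdots (X - \alpha^{(d)})$.

LEMMA 3A. Suppose we are given $(x,y)$ with $y > 0$. Let $\alpha$ be a root of $f(X)$
with

$\left| \alpha - \frac{x}{y} \right| = \min_{1 \le i \le d} \left| \alpha^{(i)} - \frac{x}{y} \right|$. (3.1)

Suppose $1 \le u \le d$ and $f^{(u)}(\alpha) \ne 0$. Then

$\left| \alpha - \frac{x}{y} \right| \le \frac{d}{2} \left( \frac{2^d |F(x,y)|}{|f^{(u)}(\alpha)| y^d} \right)^{1/u}$. (3.2)

Proof. Let $\mathfrak{S}$ denote any ordered subset of $\{1, 2, \ldots, d\}$ of cardinality $u$. How
many such $\mathfrak{S}$ are there? There are

$d(d-1) \cdots (d-u+1) \le d^u$.

Let $f_{\mathfrak{S}}(X)$ be defined by

$f_{\mathfrak{S}}(X) = a \prod_{j \notin \mathfrak{S}} (X - \alpha^{(j)})$.

Then we have

$f^{(u)}(X) = \sum_{\mathfrak{S}} f_{\mathfrak{S}}(X)$.

Therefore, given our $\alpha$, there is some such $\mathfrak{S}$ with $|f^{(u)}(\alpha)| \le d^u |f_{\mathfrak{S}}(\alpha)|$. This $\mathfrak{S}$ will
be fixed in the sequel. For any $j$, where $1 \le j \le d$, we have by (3.1) that

$|y(\alpha - \alpha^{(j)})| \le |y\alpha - x| + |y\alpha^{(j)} - x| \le 2 |y\alpha^{(j)} - x|$.

Now $f_{\mathfrak{S}}(\alpha)$ satisfies

$|y^{d-u} f_{\mathfrak{S}}(\alpha)| = |a| \prod_{j \notin \mathfrak{S}} |y(\alpha - \alpha^{(j)})| \le 2^{d-u} |a| \prod_{j \notin \mathfrak{S}} |y\alpha^{(j)} - x|$,

and $f^{(u)}(\alpha)$ satisfies

$|y^{d-u} f^{(u)}(\alpha)| \le d^u 2^{d-u} |a| \prod_{j \notin \mathfrak{S}} |y\alpha^{(j)} - x|$.

Multiplication of both sides by $\prod_{j \in \mathfrak{S}} |y\alpha^{(j)} - x|$ gives

$\left( \prod_{j \in \mathfrak{S}} |y\alpha^{(j)} - x| \right) |y^{d-u} f^{(u)}(\alpha)| \le d^u 2^{d-u} |F(x, y)|$,

so that

$|y\alpha - x|^u |y^{d-u} f^{(u)}(\alpha)| \le d^u 2^{d-u} |F(x, y)|$.

The assertion (3.2) now easily follows.
In what follows, we will apply the lemma for $u = 1$. For this we need a lower bound
for $|f'(\alpha)|$.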


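LEMMA 3B. Suppose $f(X) \in \mathbb{Z}[X]$ is of degree $d$ with $\mathrm{cont}(f) = 1$, and is
irreducible over $\mathbb{Q}$. Let $\alpha$ be a root of $f$ and $K = \mathbb{Q}(\alpha)$. Then

$|f'(\alpha)| \ge \frac{|D(f)|^{1/2}}{h_K(\alpha)^{d-2}} \ge \frac{1}{h_K(\alpha)^{d-2}}$

(where $D(f)$ is the (classical) discriminant of $f$).
Proof. Write $f(X) = a(X - \alpha^{(1)}) \cdots (X - \alpha^{(d)})$ and suppose $\alpha = \alpha^{(1)}$. We have

$f'(\alpha) = a(\alpha^{(1)} - \alpha^{(2)}) \cdots (\alpha^{(1)} - \alpha^{(d)})$

and

$f'(\alpha)^2 = \frac{a^{2d-2} \prod_{1 \le i < j \le d} (\alpha^{(i)} - \alpha^{(j)})^2}{a^{2d-4} \prod_{2 \le i < j \le d} (\alpha^{(i)} - \alpha^{(j)})^2} = \frac{D}{G}$,

say, where $G = a^{2d-4} \prod_{2 \le i < j \le d} (\alpha^{(i)} - \alpha^{(j)})^2$. We need to estimate $G$. We have

$|\alpha^{(i)} - \alpha^{(j)}|^2 \le |\alpha^{(i)}|^2 + |\alpha^{(j)}|^2 + 2|\alpha^{(i)} \alpha^{(j)}| \le (1 + |\alpha^{(i)}|^2)(1 + |\alpha^{(j)}|^2)$

since $2|\alpha^{(i)} \alpha^{(j)}| \le 1 + |\alpha^{(i)}|^2 |\alpha^{(j)}|^2$. Therefore,

$|G| \le |a|^{2d-4} \prod_{2 \le i < j \le d} (1 + |\alpha^{(i)}|^2)(1 + |\alpha^{(j)}|^2) = |a|^{2d-4} \prod_{i=2}^{d} (1 + |\alpha^{(i)}|^2)^{d-2} \le \left( |a| (1 + |\alpha^{(1)}|^2)^{1/2} \cdots (1 + |\alpha^{(d)}|^2)^{1/2} \right)^{2d-4}$.

Let $F(X,Y) = a(X - \alpha^{(1)} Y) \cdots (X - \alpha^{(d)} Y) = a L^{(1)}(\mathbf{X}) \cdots L^{(d)}(\mathbf{X})$. Then

$|G| \le \left( |a| |L^{(1)}| \cdots |L^{(d)}| \right)^{2d-4} = (h_K(\alpha))^{2d-4}$

by Lemma 2A, since $\mathrm{cont}(F) = 1$ and $H_K(L^{(1)}) = h_K(\alpha)$. The desired result now
follows, since

$|f'(\alpha)|^2 = \left| \frac{D}{G} \right| \ge \frac{|D|}{h_K(\alpha)^{2d-4}}$.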

Combining these two results, with u = 1 in Lemma 3A, gives the following. 


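LEMMA 3C. Let $F$ and $f$ be as above. Suppose

$|F(x,y)| \le m$.

If, as before, $y > 0$ and $\alpha$ is among $\alpha^{(1)}, \ldots, \alpha^{(d)}$ with

$\left| \alpha - \frac{x}{y} \right| = \min_{1 \le i \le d} \left| \alpha^{(i)} - \frac{x}{y} \right|$,

then

$\left| \alpha - \frac{x}{y} \right| \le \frac{d\, 2^d h_K(\alpha)^{d-2} m}{2 y^d}$.

This result is due to Lewis and Mahler (1961).
We now have some reasonably nice values for the constant in (1.4).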


§4. Large Solutions. Recall, we are estimating $P'_F(m)$, i.e., the number of
primitive solutions of (2.4):

$m/2^d < |F(\mathbf{x})| \le m$.

The hypothesis on $F$ in Proposition 2F is that $F$ is reduced with $\mathrm{cont}(F) = 1$ and
$H^*(F) \ge 50^d m$.
If $y = 0$, we have the primitive points $(1,0)$ and $(-1,0)$. By putting another factor
in our $\ll$ inequality, we may restrict our solutions to those with $y > 0$. We will let
“large solutions” be those with

$y \ge H^*(F)^8$. (4.1)

Write $F(X,Y) = a(X - \alpha^{(1)} Y) \cdots (X - \alpha^{(d)} Y)$ and $L = X - \alpha Y$. Then
$H^*(F) = (\mathrm{cont}\, F) H_K(L) = h_K(\alpha) = h_K(\alpha^{(i)})$. Therefore, “large solutions” satisfy $y \ge (h_K(\alpha^{(i)}))^8$
$(i = 1, \ldots, d)$. By Lemma 3C, some root $\alpha$ satisfies

$\left| \alpha - \frac{x}{y} \right| \le \frac{d\, 2^d h_K(\alpha)^{d-2} m}{2 y^d}$.

Since $h_K(\alpha) \ge 50^d m$, we have

$\left| \alpha - \frac{x}{y} \right| < \frac{h_K(\alpha)^d}{y^d}$ (4.2)

with plenty to spare. Since $d \ge 3$, we have

$d = \frac{1}{8} d + \frac{7}{8} d \ge \frac{1}{8} d + \frac{7\sqrt{3}}{8} \sqrt{d} > \frac{1}{8} d + \frac{3}{2} \sqrt{d}$.

Then (4.2) for large solutions yields

$\left| \alpha - \frac{x}{y} \right| < \frac{h_K(\alpha)^d}{y^{d/8}\, y^{3\sqrt{d}/2}} \le \frac{1}{y^{3\sqrt{d}/2}}$. (4.3)

As an application of Theorem 9A of Chapter II, we saw that the number of solutions
of (4.3) with $y \ge h(\alpha)$ is $\ll 1$. Here we have $y \ge h_K(\alpha)^8 \ge h(\alpha)$. Thus the number of
large solutions for the given root $\alpha$ is $\ll 1$. Then the total number of large solutions is
$\ll d$. We state this in the following lemma.

LEMMA 4A. Let $F$ be a form of degree $d$ as in Thue’s Theorem. Suppose
$\mathrm{cont}(F) = 1$ and $H^*(F) \ge 50^d m$. Then the number of primitive solutions of (2.4) with
$y \ge H^*(F)^8$ is $\ll d$.

Thus when it comes to large solutions, the log factor in (2.7) is unnecessary.


§5. Small Solutions. We let small solutions be those with $y < H^*(F)^8$. As in
Proposition 2F, we may assume that $F$ is reduced (and therefore normalized). We will also
assume that

$H^*(F) \ge 50^d m$. (5.1)

LEMMA 5A. Suppose $G(X,Y) = b_0 (X - \beta^{(1)} Y) \cdots (X - \beta^{(d)} Y)$ is normalized
and is equivalent to $F$. Let $\ell \in \mathbb{Z}$ and let

$n_i = |\beta^{(i)} - \ell| + 1$ $(1 \le i \le d)$.

Then $n_1 \cdots n_d \ge Q$, where

$Q = H^*(F)/m$. (5.2)

Proof. Let

$\hat{G}(X, Y) = G(X + \ell Y, Y) = b_0 \prod_{i=1}^{d} \left( X - (\beta^{(i)} - \ell) Y \right)$.

Then $\hat{G} \sim G \sim F$ and $\hat{G}$ is normalized since $G$ is normalized. Furthermore, since $F$ is
reduced, we have

$H^*(F) \le H^*(\hat{G}) = |b_0| \prod_{i=1}^{d} \sqrt{1 + |\beta^{(i)} - \ell|^2} \le m\, n_1 \cdots n_d$,

since $|b_0| \le m$.


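LEMMA 5B. Suppose $F$ is as above and $\mathbf{x}, \mathbf{x}_0$ are primitive solutions of our
inequality

$m/2^d < |F(\mathbf{x})| \le m$. (5.3)

Then there are numbers $\psi_1, \ldots, \psi_d$ (depending on $\mathbf{x}, \mathbf{x}_0, F, m$) such that the following
conditions are satisfied:

(i) for each $i$, either $\psi_i = 0$ or $1/2d \le \psi_i \le 1$,
(ii) $\sum_{i=1}^{d} \psi_i \ge 1/2$,
(iii) $|L_i(\mathbf{x}_0)/L_i(\mathbf{x})| \ge (Q^{\psi_i} - \frac{7}{2}) |D(\mathbf{x}_0, \mathbf{x})|$ $(i = 1, \ldots, d)$,

where $L_i(\mathbf{x}) = \alpha^{(i)} y - x$ and

$D(\mathbf{x}_0, \mathbf{x}) = \begin{vmatrix} x_0 & y_0 \\ x & y \end{vmatrix}$.

Proof. Since $\mathbf{x}$ is a primitive integer point, we may pick $\mathbf{x}' \in \mathbb{Z}^2$ with $D(\mathbf{x}', \mathbf{x}) = 1$.
Then $\mathbf{x}, \mathbf{x}'$ is a basis for $\mathbb{Z}^2$. Therefore,

$\mathbf{x}_0 = r\mathbf{x} + s\mathbf{x}'$ with $r, s \in \mathbb{Z}$.

From linear algebra, we have $D(\mathbf{x}_0, \mathbf{x}) = s D(\mathbf{x}', \mathbf{x}) = s$. Then we have

$\mathbf{x}_0 = r\mathbf{x} + D(\mathbf{x}_0, \mathbf{x}) \mathbf{x}'$

and

$\frac{L_i(\mathbf{x}_0)}{L_i(\mathbf{x})} = r + D(\mathbf{x}_0, \mathbf{x}) \frac{L_i(\mathbf{x}')}{L_i(\mathbf{x})} = r - D(\mathbf{x}_0, \mathbf{x}) \beta_i$ $(i = 1, \ldots, d)$,

where $\beta_i = -L_i(\mathbf{x}')/L_i(\mathbf{x})$. Set

$G(V,W) = F(V\mathbf{x} + W\mathbf{x}') = (-1)^d a \prod_{i=1}^{d} L_i(V\mathbf{x} + W\mathbf{x}') = (-1)^d a \prod_{i=1}^{d} \left( V L_i(\mathbf{x}) + W L_i(\mathbf{x}') \right) = b_0 \prod_{i=1}^{d} (V - \beta_i W)$,

where $b_0 = F(\mathbf{x})$. So $G$ is normalized and $G \sim F$.
Since $\mathbf{x}, \mathbf{x}_0$ are solutions to our original inequality, we have

$0 < |F(\mathbf{x}_0)/F(\mathbf{x})| < 2^d$,

or equivalently,

$0 < \prod_{i=1}^{d} |L_i(\mathbf{x}_0)/L_i(\mathbf{x})| < 2^d$.

Then at least one factor is no greater than 2, say

$|L_d(\mathbf{x}_0)/L_d(\mathbf{x})| \le 2$.

Then we have

$|r - D(\mathbf{x}_0, \mathbf{x}) \beta_d| \le 2$.

Putting $\beta = \mathrm{Re}\, \beta_d$ gives

$|r - D(\mathbf{x}_0, \mathbf{x}) \beta| \le 2$.

Let $\ell \in \mathbb{Z}$ with $|\beta - \ell| \le 1/2$. We have

$|L_i(\mathbf{x}_0)/L_i(\mathbf{x})| = |(\beta - \beta_i) D(\mathbf{x}_0, \mathbf{x}) + r - D(\mathbf{x}_0, \mathbf{x}) \beta|$
$\ge |\beta - \beta_i| |D(\mathbf{x}_0, \mathbf{x})| - 2$
$\ge (|\ell - \beta_i| - \frac{1}{2} - 2) |D(\mathbf{x}_0, \mathbf{x})|$
$= (n_i - \frac{7}{2}) |D(\mathbf{x}_0, \mathbf{x})|$,

if $n_i = |\ell - \beta_i| + 1$. For each $i$, put

$n_i' = \begin{cases} Q & \text{if } n_i \ge Q, \\ n_i & \text{if } Q^{1/2d} \le n_i < Q, \\ 1 & \text{if } n_i < Q^{1/2d}. \end{cases}$

Define $\psi_i$ by $n_i' = Q^{\psi_i}$. By Lemma 5A, we have $n_1 \cdots n_d \ge Q$. Therefore, $n_1' \cdots n_d' \ge Q^{1/2}$,
so that $\sum_{i=1}^{d} \psi_i \ge 1/2$. Furthermore, $n_i \ge n_i' = Q^{\psi_i}$, which gives

$|L_i(\mathbf{x}_0)/L_i(\mathbf{x})| \ge (Q^{\psi_i} - \frac{7}{2}) |D(\mathbf{x}_0, \mathbf{x})|$

as desired.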
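In applications of Lemma 5B, we will take $\mathbf{x}_0 = (1,0)$. This is a solution since $F$ is
normalized. In this case, $L_i(\mathbf{x}_0) = -1$ and $D(\mathbf{x}_0, \mathbf{x}) = y$, so the lemma gives $\psi_1, \ldots, \psi_d$
satisfying

$1/|L_i(\mathbf{x})| \ge (Q^{\psi_i} - \frac{7}{2}) y$ $(i = 1, \ldots, d)$.

Now suppose that $\psi_i \ge 1/2d$. Then $Q^{\psi_i} \ge \sqrt{50} > 7$ by (5.1), (5.2), and

$1/|L_i(\mathbf{x})| \ge Q^{\psi_i} y/2 \ge Q^{\psi_i/2} y$.

That is,

$|L_i(\mathbf{x})| \le 1/(Q^{\psi_i/2} y)$

and

$\left| \alpha^{(i)} - \frac{x}{y} \right| \le \frac{1}{Q^{\psi_i/2} y^2}$.

We state this in the following lemma.

LEMMA 5C. Suppose $F$ is a form as above (normalized and reduced) and $\mathbf{x} =
(x,y)$ is a primitive solution of our inequality, i.e.

$m/2^d < |F(x,y)| \le m$ (5.3)

with $y > 0$. Then there exist numbers $\psi_i = \psi_i(\mathbf{x})$ $(i = 1, \ldots, d)$ as above with

$\left| \alpha^{(i)} - \frac{x}{y} \right| \le \frac{1}{Q^{\psi_i/2} y^2}$ (5.4)

for every $i$ in $1 \le i \le d$ with $\psi_i > 0$.
Compare this with Lemma 3C which says that for some $\alpha^{(i)}$, we have

$\left| \alpha^{(i)} - \frac{x}{y} \right| \le \frac{d\, 2^d h_K(\alpha)^{d-2} m}{2 y^d}$.

For small values of $y$, Lemma 5C is better than 3C, because there is no constant in the
numerator in (5.4).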
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LEMMA 5D. Let & be the set of primitive solutions of our inequality (5.3) with 
1 Š y S Y = H*(F}. Then for any i, 1 Ê i d, we have 


log¥ 
Y vi(x) «14+ E. 
eer logQ 


Proof. Given i, let x1,... ,x, be the elements of X with ,(x) > 0, ordered such 
that yı S y2 $ + Sw SY. Then 


l < |Zj _Zj+i 
YjYj+1 Yj  Yj+ı 
< ja®— Fj + ao —Zi+i 
~ Yj Yj+1 
3 1 1 
Ques t Query, 
| 1 


= Shion Fh ae 
qr 2y?  2yjYj+ 


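The last inequality follows from $Q^{\psi_i(\mathbf{x}_{j+1})/2} \ge Q^{1/4d} \ge 50^{1/4} > 2$ and $y_{j+1} \ge y_j$. Now we have
\[ \frac{1}{2 y_j y_{j+1}} \le \frac{1}{Q^{\psi_i(\mathbf{x}_j)/2}\, y_j^2}, \]
which gives
\[ y_{j+1} \ge \tfrac{1}{2}\, Q^{\psi_i(\mathbf{x}_j)/2}\, y_j \ge Q^{\psi_i(\mathbf{x}_j)/4}\, y_j. \]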


(This relationship between $y_j$ and $y_{j+1}$ can be thought of as a "variable gap principle".) Since we have $y_\nu / y_1 \le Y$, this gap principle tells us that
\[ \prod_{j=1}^{\nu-1} Q^{\psi_i(\mathbf{x}_j)/4} \le Y, \]
and therefore
\[ \sum_{j=1}^{\nu-1} \psi_i(\mathbf{x}_j) \le 4\,\frac{\log Y}{\log Q}. \]
Since $\psi_i(\mathbf{x}_\nu) \le 1$, the lemma follows.

We are now able to estimate the cardinality of $\mathfrak{X}$. From Lemmas 5B (ii) and 5D, we have


\[ |\mathfrak{X}| \ll d\left(1 + \frac{\log Y}{\log Q}\right). \]
Therefore, on using (5.1), (5.2), we obtain


logm 
|X| <a(1+ cn) 


<a(i+ em) 


= d(1 + logm'/¢). 


We have proven, then, the Proposition, and therefore the main result (Theorem 1C), which said that the number of solutions of the Thue inequality $|F(\mathbf{x})| \le m$ is
\[ \ll d\, m^{2/d}\left(1 + \log m^{1/d}\right). \]
In particular, the number of solutions of $F(\mathbf{x}) = 1$ is $\ll d$.


Remark. One might believe that the number of solutions could be estimated in terms of the number of real $\alpha^{(i)}$'s, rather than $d$. This is not the case, as shown by the following.


Example. Let $F(X,Y) = X^d + c(X - Y)^2(2X - Y)^2 \cdots \left(\tfrac{d}{2}X - Y\right)^2$, where $c > 0$ and $d$ is even. Then $F(x,y) = 0$ for real $x, y$ would imply that $x = y = 0$. So $F(x,1)$ has no real root. However, $F(x,y) = 1$ has $d$ non-trivial integer solutions, namely $\pm(1,1), \pm(1,2), \dots, \pm(1, d/2)$. This example was communicated to me by M. Waldschmidt.
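The example can be checked by direct computation. The following is a minimal sketch, assuming an integer value of $c$ and a small even degree; the function names `F` and `listed_solutions` are illustrative and not from the text.

```python
from math import prod

def F(x: int, y: int, d: int, c: int = 1) -> int:
    """F(X, Y) = X^d + c * prod_{j=1}^{d/2} (j*X - Y)^2, for even d."""
    if d % 2 != 0:
        raise ValueError("d must be even")
    return x**d + c * prod((j * x - y) ** 2 for j in range(1, d // 2 + 1))

def listed_solutions(d: int) -> list:
    """The d points +-(1, k), k = 1, ..., d/2, named in the example."""
    pts = [(1, k) for k in range(1, d // 2 + 1)]
    return pts + [(-x, -y) for (x, y) in pts]

d, c = 8, 3  # assumed sample values
sols = listed_solutions(d)
# Each listed point kills the factor with j = k, so F reduces to (+-1)^d = 1.
print(len(sols), all(F(x, y, d, c) == 1 for (x, y) in sols))
```

Both summands of $F$ are non-negative for real arguments, which is why the form has no real zero other than the origin even though it has $d$ integer solutions of $F = 1$.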


§6. How to Go from $F(\mathbf{x}) = 1$ to $F(\mathbf{x}) = m$. We now know that the equation $F(\mathbf{x}) = 1$ has $\ll d$ solutions. We want to extend our results to Theorem 1B, which states that $F(\mathbf{x}) = m$ has $\ll d^{1+\nu}$ solutions, where $\nu = \nu(m)$ is the number of distinct prime factors of $m$.

To consider this problem, we go to a wider setting. Consider forms $F(X_1,\dots,X_n) \in \mathbb{Z}[X_1,\dots,X_n]$ which are decomposable, i.e., $F = L_1 \cdots L_d$ where $L_1,\dots,L_d$ are linear forms in $n$ variables with complex coefficients.


Exercise 6a. Let $F$ be decomposable as above. Show that $F$ can be written as a product $L_1 \cdots L_d$, where now the coefficients of each $L_i$ generate an algebraic number field $K_i$ of degree $[K_i : \mathbb{Q}] \le d$.

For given $n$ and $d$, let $\mathfrak{C}$ be a class of decomposable forms which is closed in the following sense. If $F \in \mathfrak{C}$ and $G = cF^T$ with $c$ a non-zero rational, $T \in GL(n,\mathbb{Z})$, and $G \in \mathbb{Z}[X_1,\dots,X_n]$, then $G \in \mathfrak{C}$ also.


Example. Suppose $n = 2$ and $\mathfrak{C}$ is the class of forms of given degree $d \ge 3$ which are irreducible over $\mathbb{Q}$, as in Thue's equation. Then this class is closed, since irreducibility is unchanged under a linear transformation.


Example. Suppose $F = aL^{(1)} \cdots L^{(d)}$ is a norm form where any $n$ among $L^{(1)},\dots,L^{(d)}$ are linearly independent. Then this class is also closed.

For a given class $\mathfrak{C}$, let $M_{\mathfrak{C}}(1)$ be the maximum number of solutions of $F(\mathbf{x}) = 1$ with $\mathbf{x} \in \mathbb{Z}^n$, where the maximum is over all $F \in \mathfrak{C}$. Let $M_{\mathfrak{C}}(m)$ be the maximum number of primitive solutions of $F(\mathbf{x}) = m$ over all $F \in \mathfrak{C}$. It is trivial that $M_{\mathfrak{C}}(m) = M_{\mathfrak{C}}(-m)$.


THEOREM 6A. If $M_{\mathfrak{C}}(1)$ is finite, then
\[ M_{\mathfrak{C}}(m) \le \binom{d}{n-1}^{\nu} d_{n-1}(m^d)\, M_{\mathfrak{C}}(1), \]
where $\nu = \nu(m)$ is the number of distinct prime factors of $m$, and $d_{n-1}(x)$ is the number of ways of writing $x = x_1 x_2 \cdots x_{n-1}$ with $x_i \in \mathbb{N}$ $(i = 1,\dots,n-1)$.

This result has its genesis in Bombieri and Schmidt (1987), where the case $n = 2$ was shown.


Remark. The function $d_2(x)$ is the ordinary divisor function, while $d_1(x) = 1$. In the special case $n = 2$, we have
\[ M_{\mathfrak{C}}(m) \le d^{\nu} M_{\mathfrak{C}}(1). \]
Furthermore, for the Thue equation, we know that $M_{\mathfrak{C}}(1) \ll d$, so that
\[ M_{\mathfrak{C}}(m) \ll d^{1+\nu}, \]
and Theorem 1B is proved.


Exercise 6b. Given $n$ and $d$, let
\[ g(m) = \binom{d}{n-1}^{\nu(m)} d_{n-1}(m^d). \]
Show that $g(m)$ is multiplicative, i.e., $g(m_1 m_2) = g(m_1) g(m_2)$ if $\gcd(m_1, m_2) = 1$.
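The function $g$ of Exercise 6b is easy to tabulate, which gives a numerical sanity check of the multiplicativity (not a proof). The sketch below assumes $n \ge 2$ and uses the standard stars-and-bars count $d_k(p^a) = \binom{a+k-1}{k-1}$ for ordered factorizations into $k$ factors; the helper names are illustrative.

```python
from math import comb

def factorize(m: int) -> dict:
    """Prime factorization of m >= 1 by trial division: {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def d_k(x: int, k: int) -> int:
    """Number of ordered factorizations x = x_1 * ... * x_k with x_i in N (k >= 1)."""
    result = 1
    for a in factorize(x).values():
        result *= comb(a + k - 1, k - 1)  # ways to split exponent a among k factors
    return result

def g(m: int, n: int, d: int) -> int:
    """g(m) = binom(d, n-1)^{nu(m)} * d_{n-1}(m^d), as in Exercise 6b (n >= 2)."""
    nu = len(factorize(m))
    return comb(d, n - 1) ** nu * d_k(m**d, n - 1)

# Coprime arguments: g(6) should equal g(2) * g(3).
print(g(6, 3, 4) == g(2, 3, 4) * g(3, 3, 4))
```

For $n = 2$ the factor $d_1$ is identically 1, so `g(m, 2, d)` reduces to $d^{\nu(m)}$, matching the special case stated in the Remark.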


Exercise 6c. Given $n$, $d$ and $\varepsilon > 0$, and for $g(m)$ as above, show that
\[ g(m) \ll_{n,d,\varepsilon} m^{\varepsilon} \]
as $m \to \infty$.

Because of the multiplicativity of $g(m)$, it will suffice to prove the following.


PROPOSITION 6B. For $k > 0$ and a prime power $p^u$ where $u \ge 1$ and $p \nmid k$, we have
\[ M_{\mathfrak{C}}(kp^u) \le \binom{d}{n-1} d_{n-1}(p^{du})\, M_{\mathfrak{C}}(k). \]


In what follows, suppose that $E$ is a field and $|\ |$ is a non-Archimedean absolute value on $E$, sometimes called an ultra-metric absolute value. For $\mathbf{x} = (x_1,\dots,x_n) \in E^n$, put $|\mathbf{x}| = \max(|x_1|,\dots,|x_n|)$. Similarly, if $P(X_1,\dots,X_n)$ is a polynomial with coefficients in $E$, then put $|P| = \max |c|$, over all coefficients $c$ of $P$.
LEMMA 6C. Suppose such a field $E$ is algebraically closed. Then
\[ \max_{|\mathbf{x}| \le 1} |P(\mathbf{x})| = |P|. \]


Proof. It is clear from the properties of a non-Archimedean absolute value that
\[ \max_{|\mathbf{x}| \le 1} |P(\mathbf{x})| \le |P|. \]
Thus it will suffice to show that there is an $\mathbf{x} \in E^n$ with $|\mathbf{x}| = 1$ and $|P(\mathbf{x})| = |P|$.

Consider the case $n = 1$. We may suppose that $P \ne 0$. Then write $P(X) = P_1(X) + P_2(X)$, where $P_1$ is the sum of the monomials of $P$ whose coefficients have norm $|P|$, and $P_2$ is the sum of the remaining monomials. Suppose that
\[ P_1(X) = c_{i_1} X^{i_1} + c_{i_2} X^{i_2} + \cdots + c_{i_k} X^{i_k} \]
with $i_1 < i_2 < \cdots < i_k$. Choose a nonzero $x \in E$ with
\[ P_1(x) = c_{i_k} x^{i_k + 1}. \]
Then $|x| = 1$. For if $|x| > 1$, the summand $c_{i_k} x^{i_k+1}$ would dominate, and if $|x| < 1$, the summand $c_{i_1} x^{i_1}$ would dominate. We have
\[ |P_1(x)| = |c_{i_k}| = |P| \]
and
\[ |P_2(x)| < |P|. \]
Therefore, by the isosceles triangle principle for non-Archimedean absolute values, we have
\[ |P(x)| = |P|, \]
as desired.

The general case follows by induction on $n$.


Remark. The hypothesis of algebraic closure was really necessary, as illustrated by the following example.


Example. Let $E$ be the field of rational numbers with $|\ | = |\ |_p$, the $p$-adic absolute value associated with some prime $p$. Let $P = X^p - X$. Then $|P| = 1$. Suppose $x \in \mathbb{Q}$ and $|x| \le 1$. Write $x = u/v$ with $u, v \in \mathbb{Z}$ and $p \nmid v$. Then
\[ x^p - x = \left(\frac{u}{v}\right)^p - \left(\frac{u}{v}\right) = \frac{u^p - u v^{p-1}}{v^p}. \]
The numerator satisfies $u^p - u v^{p-1} \equiv u^p - u \equiv 0 \pmod p$. Therefore, $|x^p - x| \le 1/p$ and
\[ \max_{|x| \le 1} |P(x)| = \frac{1}{p}. \]
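The example can be tested on a finite sample of rationals. This is a small sketch, assuming $p = 5$ and an arbitrary grid of fractions with denominators prime to $p$; the helper `abs_p` is an illustrative implementation of the $p$-adic absolute value on $\mathbb{Q}$.

```python
from fractions import Fraction

def abs_p(x: Fraction, p: int) -> Fraction:
    """p-adic absolute value of a rational number (0 maps to 0)."""
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:   # powers of p in the numerator raise the valuation
        num //= p
        v += 1
    while den % p == 0:   # powers of p in the denominator lower it
        den //= p
        v -= 1
    return Fraction(1, p) ** v

p = 5  # assumed sample prime
# Rationals u/v with p not dividing v, i.e. |u/v|_p <= 1.
sample = [Fraction(u, v) for u in range(-20, 21) for v in range(1, 13) if v % p != 0]
values = [abs_p(x**p - x, p) for x in sample]
print(max(values) <= Fraction(1, p))
```

On this sample the largest value is attained at $x = p$, where $p^p - p = p(p^{p-1} - 1)$ has $p$-adic absolute value exactly $1/p$, while $|P| = 1$.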
LEMMA 6D. Let $E$ be a field, not necessarily algebraically closed, with a non-Archimedean absolute value $|\ |$. Let $L_1(\mathbf{X}),\dots,L_t(\mathbf{X})$ be linearly dependent linear forms with coefficients in $E$. Given non-negative real numbers $\lambda_1,\dots,\lambda_t$, let $\mathfrak{K}$ be the set of $\mathbf{x}$ with components in $E$ and satisfying
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,t). \tag{6.1} \]
Then $\mathfrak{K}$ may be defined by $t - 1$ of these inequalities, i.e., there exists an $i_0$, $1 \le i_0 \le t$, such that $\mathfrak{K}$ is defined by
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,i_0 - 1, i_0 + 1,\dots,t). \]


Remark. This does not hold in the Archimedean case. Consider, for example, the field of real numbers $\mathbb{R}$ with the usual Archimedean absolute value and the system of inequalities
\[ |x_1| \le 1, \qquad |x_2| \le 1, \qquad |x_1 + x_2| \le 3/2. \]
The corresponding linear forms are clearly linearly dependent. However, the solution set (shown below) cannot be defined with fewer than three inequalities.

[Figure: the solution set of the three inequalities in the $(x_1, x_2)$-plane.]


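Proof. Since $L_1,\dots,L_t$ are linearly dependent, there is a non-trivial relation
\[ \gamma_1 L_1 + \cdots + \gamma_t L_t = 0, \]
with $\gamma_i \in E$ $(i = 1,\dots,t)$. Furthermore, not all of the $\gamma_i$'s are zero, say $\gamma_1 \ne 0,\dots,\gamma_\ell \ne 0$, but $\gamma_{\ell+1} = \cdots = \gamma_t = 0$. Without loss of generality, we have
\[ |\gamma_1|\lambda_1 = \max(|\gamma_1|\lambda_1,\dots,|\gamma_\ell|\lambda_\ell). \]
Now take $i_0 = 1$. If (6.1) holds for $i = 2,\dots,t$, then the linear relation gives
\[
\begin{aligned}
|\gamma_1 L_1(\mathbf{x})| &\le \max(|\gamma_2||L_2(\mathbf{x})|,\dots,|\gamma_\ell||L_\ell(\mathbf{x})|) \\
&\le \max(|\gamma_2|\lambda_2,\dots,|\gamma_\ell|\lambda_\ell) \\
&\le |\gamma_1|\lambda_1,
\end{aligned}
\]
so that $|L_1(\mathbf{x})| \le \lambda_1$, and the proof is complete.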
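The contrast between Lemma 6D and the Archimedean Remark can be seen on the same three forms $L_1 = x_1$, $L_2 = x_2$, $L_3 = x_1 + x_2$. The sketch below is only an illustration on sample points: it assumes $p = 3$, the bounds $\lambda = (1, 1/3, 1)$, and the relation $L_1 + L_2 - L_3 = 0$ (so every $|\gamma_i| = 1$); the helper names are not from the text.

```python
from fractions import Fraction

def abs_p(x: Fraction, p: int) -> Fraction:
    """p-adic absolute value of a rational number (0 maps to 0)."""
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v

def redundant_index(gamma_abs, lambdas):
    """Index i0 maximizing |gamma_i| * lambda_i over gamma_i != 0, as in the proof."""
    candidates = [i for i, g in enumerate(gamma_abs) if g != 0]
    return max(candidates, key=lambda i: gamma_abs[i] * lambdas[i])

p = 3
lambdas = [Fraction(1), Fraction(1, 3), Fraction(1)]  # assumed bounds
i0 = redundant_index([1, 1, 1], lambdas)               # picks the first maximum

# p-adic case: the inequality with index i0 follows from the other two.
grid = [Fraction(u, v) for u in range(-9, 10) for v in (1, 2, 3, 9)]
padic_ok = True
for x1 in grid:
    for x2 in grid:
        vals = [abs_p(x1, p), abs_p(x2, p), abs_p(x1 + x2, p)]
        others = all(vals[i] <= lambdas[i] for i in range(3) if i != i0)
        if others and not vals[i0] <= lambdas[i0]:
            padic_ok = False

# Real case of the Remark: each inequality can fail while the other two hold.
def real_checks(x1, x2):
    return [abs(x1) <= 1, abs(x2) <= 1, abs(x1 + x2) <= Fraction(3, 2)]

witnesses = {0: (2, -1), 1: (-1, 2), 2: (1, 1)}
print(i0, padic_ok)
```

Each entry of `witnesses` is a real point violating exactly the inequality with that index, so none of the three can be dropped over $\mathbb{R}$.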


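LEMMA 6E. Let $E$ and $|\ |$ be as above. Let $L_1,\dots,L_n$ be $n$ linear forms in $n$ variables with coefficients in $E$. Let non-negative $\lambda_1,\dots,\lambda_n$ be given. Suppose there is an $\mathbf{x}' \in E^n$ with
\[ |\mathbf{x}'| = 1, \qquad |L_i(\mathbf{x}')| \le \lambda_i \qquad (i = 1,\dots,n). \]
Then there exists an $i_0$, $1 \le i_0 \le n$, such that any $\mathbf{x}$ satisfying
\[ |\mathbf{x}| \le 1, \qquad |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,i_0 - 1, i_0 + 1,\dots,n), \tag{6.2} \]
has in fact
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,n). \]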


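Proof. By the preceding lemma, we may suppose that $L_1,\dots,L_n$ are linearly independent. Then each coordinate $X_i$ may be written as a linear combination of the linear forms $L_1,\dots,L_n$. That is,
\[ X_i = \gamma_{i1} L_1(\mathbf{X}) + \cdots + \gamma_{in} L_n(\mathbf{X}) \qquad (i = 1,\dots,n), \]
with $\gamma_{ij} \in E$ for $1 \le i, j \le n$. Without loss of generality,
\[ |\gamma_{11}|\lambda_1 = \max_{1 \le i, j \le n} |\gamma_{ij}|\lambda_j. \]
The given $\mathbf{x}'$ has
\[ |x_i'| \le \max_{1 \le j \le n} |\gamma_{ij}|\lambda_j \le |\gamma_{11}|\lambda_1 \qquad (i = 1,\dots,n), \]
and since $|\mathbf{x}'| = 1$, this gives us $1 \le |\gamma_{11}|\lambda_1$. In particular, $\gamma_{11} \ne 0$.

Now we will show that $i_0 = 1$ works. Suppose $\mathbf{x}$ satisfies (6.2). Then the linear expression for the first coordinate gives
\[
\begin{aligned}
|\gamma_{11} L_1(\mathbf{x})| &\le \max(|\gamma_{12}||L_2(\mathbf{x})|,\dots,|\gamma_{1n}||L_n(\mathbf{x})|, |x_1|) \\
&\le \max(|\gamma_{12}|\lambda_2,\dots,|\gamma_{1n}|\lambda_n, |x_1|) \\
&\le |\gamma_{11}|\lambda_1,
\end{aligned}
\]
since $|x_1| \le |\mathbf{x}| \le 1 \le |\gamma_{11}|\lambda_1$. Dividing by $|\gamma_{11}|$ in the inequality gives $|L_1(\mathbf{x})| \le \lambda_1$, and the lemma is proved.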


LEMMA 6F. For $d \ge n$, let $L_1(\mathbf{X}),\dots,L_d(\mathbf{X})$ be $d$ linear forms in $n$ variables with coefficients in $E$. Let non-negative real numbers $\lambda_1,\dots,\lambda_d$ be given. Suppose there is an $\mathbf{x}' \in E^n$ with
\[ |\mathbf{x}'| = 1, \qquad |L_i(\mathbf{x}')| \le \lambda_i \qquad (i = 1,\dots,d). \]
Then there are $n - 1$ among these forms, say $L_{i_1},\dots,L_{i_{n-1}}$, such that any $\mathbf{x} \in E^n$ satisfying
\[ |\mathbf{x}| \le 1, \qquad |L_{i_j}(\mathbf{x})| \le \lambda_{i_j} \qquad (j = 1,\dots,n-1) \]
also has
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,d). \]

Proof. Apply Lemma 6D repeatedly, $d - n$ times, to reduce the system to $n$ linear forms. Then apply Lemma 6E to reduce further to $n - 1$ forms.

For the following exercises, suppose that $E$ is a field with a non-Archimedean absolute value $|\ |$, as before. Furthermore, suppose that $E$ is complete, i.e., any Cauchy sequence has a limit in $E$ with respect to the absolute value. A set $\mathfrak{K}$ in $E^n$ is called symmetric, convex if $\lambda\mathbf{x} + \mu\mathbf{y} \in \mathfrak{K}$ for every $\mathbf{x}, \mathbf{y} \in \mathfrak{K}$ and $|\lambda| \le 1$, $|\mu| \le 1$. Given $\mathbf{x}_1,\dots,\mathbf{x}_n \in \mathfrak{K}$, their convex hull is the set of all points $\lambda_1\mathbf{x}_1 + \cdots + \lambda_n\mathbf{x}_n$ with $|\lambda_i| \le 1$ $(i = 1,\dots,n)$. This is simply the smallest convex, symmetric set containing $\mathbf{x}_1,\dots,\mathbf{x}_n$.


Exercise 6d. Suppose $\mathfrak{K} \subset E^n$ is a symmetric, convex set, as above. Suppose also that $\mathfrak{K}$ is sequentially compact (i.e., every sequence in $\mathfrak{K}$ has a convergent subsequence) and that $\mathfrak{K}$ contains $n$ linearly independent points. Show that $\mathfrak{K}$ is the convex hull of certain $n$ linearly independent points.


Exercise 6e. Suppose $\mathfrak{K}$ is as in Exercise 6d. Show that there are $n$ linearly independent linear forms $L_1,\dots,L_n$ with coefficients in $E$ such that $\mathfrak{K}$ may be defined by the inequalities
\[ |L_i(\mathbf{x})| \le 1 \qquad (i = 1,\dots,n). \]


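Returning to Proposition 6B, we want to consider the number of primitive solutions of
\[ F(\mathbf{x}) = kp^u, \quad \text{where } p \nmid k. \tag{6.3} \]
By a previous exercise, we write $F(\mathbf{X}) = L_1(\mathbf{X}) \cdots L_d(\mathbf{X})$, where the coefficients of $L_i$ lie in a field $K_i$. Let $E$ be the algebraic closure of $\mathbb{Q}$ and let $|\ |$ on $E$ be some extension of the $p$-adic absolute value $|\ |_p$. With any primitive solution $\mathbf{x}'$ of (6.3), we will associate real numbers $\lambda_1,\dots,\lambda_d$ such that
\[ |L_i(\mathbf{x}')| = \lambda_i \qquad (i = 1,\dots,d). \]
Since $\mathbf{x}'$ is primitive, we have $|\mathbf{x}'|_p = 1$. Therefore, by Lemma 6F, given such an $\mathbf{x}'$, we can pick $n - 1$ of the forms, say $L_{i_1},\dots,L_{i_{n-1}}$, such that any $\mathbf{x} \in E^n$ with
\[ |\mathbf{x}| \le 1, \qquad |L_{i_j}(\mathbf{x})| \le \lambda_{i_j} \qquad (j = 1,\dots,n-1) \]
also satisfies
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,d). \]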
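With each $\mathbf{x}'$, we associate $(L_{i_1},\dots,L_{i_{n-1}}; \lambda_{i_1},\dots,\lambda_{i_{n-1}})$, which will be called the anchor of $\mathbf{x}'$. Initially, we will count the number of solutions with a given anchor.

Without loss of generality, consider solutions with the anchor $(L_1,\dots,L_{n-1}; \lambda_1,\dots,\lambda_{n-1})$. Such an anchor is associated with $\mathbf{x}'$ and $\lambda_1,\dots,\lambda_d$ satisfying
\[ \lambda_1 \cdots \lambda_d = |L_1(\mathbf{x}') \cdots L_d(\mathbf{x}')| = |kp^u| = p^{-u}. \]
Suppose some $\mathbf{x} \in E^n$ is a solution to the inequalities
\[ |\mathbf{x}| \le 1 \quad \text{and} \quad |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,n-1). \tag{6.4} \]
By Lemma 6F, such an $\mathbf{x}$ also satisfies
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,d). \]
Thus
\[ |F(\mathbf{x})| = |L_1(\mathbf{x}) \cdots L_d(\mathbf{x})| \le \lambda_1 \cdots \lambda_d = p^{-u}. \]

Now, the points $\mathbf{x} \in \mathbb{Z}^n$ with
\[ |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,n-1) \]
make up a lattice. Choose some basis for this lattice, say $\mathbf{a}_1,\dots,\mathbf{a}_n$. Consider $\mathbf{x} = y_1\mathbf{a}_1 + \cdots + y_n\mathbf{a}_n$ where $y_i \in E$ and $|y_i| \le 1$ $(i = 1,\dots,n)$. For such an $\mathbf{x}$, we have (6.4), namely
\[ |\mathbf{x}| \le 1 \quad \text{and} \quad |L_i(\mathbf{x})| \le \lambda_i \qquad (i = 1,\dots,n-1), \]
therefore
\[ |F(\mathbf{x})| \le p^{-u} \]
as above. Furthermore, if $\mathbf{x} \in \mathbb{Z}^n$ belongs to our anchor, then $\mathbf{x}$ is in this lattice, i.e.
\[ \mathbf{x} = y_1\mathbf{a}_1 + \cdots + y_n\mathbf{a}_n. \tag{6.5} \]

We introduce another form $G$ by $G(Y_1,\dots,Y_n) = F(Y_1\mathbf{a}_1 + \cdots + Y_n\mathbf{a}_n)$. If $\mathbf{x}$ satisfies $F(\mathbf{x}) = kp^u$ and (6.5), then $G(\mathbf{y}) = kp^u$, where $\mathbf{y} = (y_1,\dots,y_n)$ from (6.5). But for any $\mathbf{y} \in E^n$ with $|\mathbf{y}| \le 1$, we have $|G(\mathbf{y})| \le p^{-u}$. Hence, $|G| \le p^{-u}$, by Lemma 6C. We know, then, that every coefficient of $G$ is divisible by $p^u$. Set $G(\mathbf{Y}) = p^u R(\mathbf{Y})$ with $R(\mathbf{Y}) \in \mathbb{Z}[\mathbf{Y}]$. Then $G(\mathbf{y}) = kp^u$ becomes $R(\mathbf{y}) = k$. Furthermore, $R \in \mathfrak{C}$, so the number of solutions to $R(\mathbf{y}) = k$ is not greater than $M_{\mathfrak{C}}(k)$.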

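We may conclude, then, that the number of solutions with a given anchor is not greater than $M_{\mathfrak{C}}(k)$. It remains for us to estimate the number of possible anchors $(L_{i_1},\dots,L_{i_{n-1}}; \lambda_{i_1},\dots,\lambda_{i_{n-1}})$. The number of choices for $(n-1)$-tuples, i.e., $(i_1,\dots,i_{n-1})$, is $\binom{d}{n-1}$. Given $i_1,\dots,i_{n-1}$, how many possibilities exist for $\lambda_{i_1},\dots,\lambda_{i_{n-1}}$? Without loss of generality, we may consider the number of anchors of the form $(L_1,\dots,L_{n-1}; \lambda_1,\dots,\lambda_{n-1})$. To answer our question, it is necessary to count the number of possibilities for $\lambda_1,\dots,\lambda_{n-1}$.

We begin by making several observations.

(i) By Gauss' Lemma, we have $|L_1| \cdots |L_d| = |F| \le 1$. Suppose $\lambda_1,\dots,\lambda_{n-1}$ belongs to an anchor coming from $\lambda_1,\dots,\lambda_d$. Then $\lambda_1 \cdots \lambda_d = p^{-u}$. Putting $\mu_i = \lambda_i / |L_i|$, we have
\[ \mu_i \le 1 \qquad (i = 1,\dots,d) \]
and
\[ \mu_1 \cdots \mu_{n-1} \ge p^{-u}. \]

(ii) If $\lambda_1,\dots,\lambda_{n-1}$ and $\lambda_1',\dots,\lambda_{n-1}'$ both occur in anchors and if $\lambda_i \le \lambda_i'$ $(i = 1,\dots,n-1)$, then $\lambda_i = \lambda_i'$ $(i = 1,\dots,n-1)$. For suppose $\lambda_1,\dots,\lambda_{n-1}$ comes from $\lambda_1,\dots,\lambda_d$ and $\lambda_1',\dots,\lambda_{n-1}'$ comes from $\lambda_1',\dots,\lambda_d'$. Then there must be $\mathbf{x}$ and $\mathbf{x}'$ with $|\mathbf{x}| = |\mathbf{x}'| = 1$, $|L_i(\mathbf{x})| = \lambda_i$, and $|L_i(\mathbf{x}')| = \lambda_i'$. Therefore
\[ |\mathbf{x}| = 1 \quad \text{and} \quad |L_i(\mathbf{x})| \le \lambda_i' \qquad (i = 1,\dots,n-1). \]
Since $\lambda_1',\dots,\lambda_{n-1}'$ belong to an anchor, we have
\[ |L_i(\mathbf{x})| \le \lambda_i' \qquad (i = 1,\dots,d), \]
and thus,
\[ \lambda_i \le \lambda_i' \qquad (i = 1,\dots,d). \]
However,
\[ \lambda_1 \cdots \lambda_d = \lambda_1' \cdots \lambda_d' = p^{-u}; \]
therefore
\[ \lambda_i = \lambda_i' \qquad (i = 1,\dots,d). \]

(iii) Recall that, for each $i$, the field $K_i$ is an algebraic extension of $\mathbb{Q}$ of degree $\le d$. The $p$-adic absolute value $|\ |_p$ on $\mathbb{Q}$ has the value set $\{p^m : m \in \mathbb{Z}\}$. Since the absolute value $|\ |$ on $K_i$ is an extension of $|\ |_p$, we know that it has the value set $\{p^{m/e_i} : m \in \mathbb{Z}\}$, where $e_i = e_i(K_i)$ and $1 \le e_i \le \deg K_i \le d$. (This $e_i$ is known as the ramification index.)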
Given a primitive solution $\mathbf{x}$, for each $i$ we have $\lambda_i = |L_i(\mathbf{x})|$. From observation (iii), $\lambda_i$ is of the type $p^{m/e_i}$ for some $m \in \mathbb{Z}$. Then write
\[ \mu_i = \frac{\lambda_i}{|L_i|} = p^{-\nu_i/e_i} \]
for some $\nu_i \in \mathbb{Z}$. We have $\nu_i \ge 0$ $(i = 1,\dots,n-1)$ and $\mu_1 \cdots \mu_{n-1} \ge p^{-u}$ by observation (i). Hence
\[ \frac{\nu_1}{e_1} + \cdots + \frac{\nu_{n-1}}{e_{n-1}} \le u. \]

Suppose two different anchors have $\nu_1,\dots,\nu_{n-1}$ and $\nu_1',\dots,\nu_{n-1}'$ with $\nu_i \le \nu_i'$ $(i = 1,\dots,n-1)$. By applying observation (ii) to the corresponding $\lambda_1,\dots,\lambda_{n-1}$ and $\lambda_1',\dots,\lambda_{n-1}'$, we find that $\nu_i = \nu_i'$ $(i = 1,\dots,n-1)$. Therefore, $\nu_{n-1}$ is unique once $\nu_1,\dots,\nu_{n-2}$ are given.

It remains, then, to count the number of non-negative $\nu_1,\dots,\nu_{n-2}$ with
\[ \frac{\nu_1}{e_1} + \cdots + \frac{\nu_{n-2}}{e_{n-2}} \le u. \]
Since $e_i \le d$, this number is bounded by the number of non-negative $\nu_1,\dots,\nu_{n-2}$ with $\nu_1 + \cdots + \nu_{n-2} \le du$, which in turn equals the number of non-negative $\nu_1,\dots,\nu_{n-1}$ with $\nu_1 + \cdots + \nu_{n-1} = du$. If one thinks of factoring $p^{du}$ as $p^{\nu_1} \cdots p^{\nu_{n-1}}$, it is obvious that the number of non-negative $\nu_1,\dots,\nu_{n-1}$ with $\nu_1 + \cdots + \nu_{n-1} = du$ is simply $d_{n-1}(p^{du})$. Thus, the number of possible anchors is at most $\binom{d}{n-1} d_{n-1}(p^{du})$, and the number of primitive solutions of $F(\mathbf{x}) = kp^u$ is not greater than
\[ \binom{d}{n-1} d_{n-1}(p^{du})\, M_{\mathfrak{C}}(k). \]
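The two counting identities used at the end of the proof can be confirmed by brute force for small parameters. In this sketch $N$ stands for $du$, and the closed form $\binom{N+n-2}{n-2}$ is the standard stars-and-bars value of $d_{n-1}(p^N)$; the function name is illustrative.

```python
from itertools import product
from math import comb

def count_tuples(parts: int, total: int, exact: bool) -> int:
    """Count tuples of `parts` non-negative integers with sum == total (exact)
    or sum <= total (not exact), by direct enumeration."""
    count = 0
    for nu in product(range(total + 1), repeat=parts):
        s = sum(nu)
        if (s == total) if exact else (s <= total):
            count += 1
    return count

# (n-2)-tuples with sum <= N, (n-1)-tuples with sum == N, and the closed form.
checks = []
for n in range(2, 6):
    for N in range(0, 7):
        at_most = count_tuples(n - 2, N, exact=False)
        equal = count_tuples(n - 1, N, exact=True)
        checks.append(at_most == equal == comb(N + n - 2, n - 2))
print(all(checks))
```

The passage from "sum at most $N$" with $n - 2$ parts to "sum exactly $N$" with $n - 1$ parts just adds a slack variable, which is why the two counts agree.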


§7. Thue Equations with Few Coefficients.


Recall that the bound on the number of solutions to $F(x,y) = m$ depends only on $d$ (the degree of $F$), and on $m$. Siegel (1929) hypothesized that there ought to be a bound for the number of solutions of certain diophantine equations $f(x,y) = 0$ which depends only on the number of non-zero coefficients.

We consider the special case of the Thue equations $F(x,y) = m$. For cubic Thue equations, the Siegel conjecture is incorrect. There is no bound independent of $m$. See Ch. IV, §8.


Consider the simplest Thue equation, namely the binomial one
\[ ax^d - by^d = m, \qquad d \ge 3. \]
This equation had already been studied by Siegel (1929), (1970), Domar (1954), Hyyrö (1964), Evertse (1982), and Mueller (1987). They achieved upper bounds $B = B(m)$ independent of the degree $d$. Consider also the trinomial Thue equation,
\[ ax^d + bx^q y^{d-q} + cy^d = m, \qquad d \ge 3. \]
In this case, Mueller and Schmidt (1987) obtained a bound $B = B(m)$.

One may also study the more general Thue equations where $F$ is a form of degree $d \ge 3$ with $s + 1$ non-zero coefficients. That is, $F$ may be written as
\[ F(x,y) = \sum_{i=0}^{s} a_i x^{d_i} y^{d - d_i} \tag{7.1} \]
with $0 = d_0 < d_1 < \dots < d_s = d$. It turned out to be just as easy to study the related "Thue inequality"
\[ |F(x,y)| \le m. \tag{7.2} \]

Schmidt (1987) obtained a bound on the number of solutions, namely
\[ \ll \sqrt{ds}\; m^{2/d}\left(1 + \log m^{1/d}\right). \]
Later, Mueller and Schmidt (1988) obtained a second bound, showing that the number of solutions of (7.2) is
\[ \ll s^2\, m^{2/d}\left(1 + \log m^{1/d}\right). \]
This bound is itself bounded in terms of $m$ and $s$, i.e. the number of coefficients of $F$, but independently of the degree. It is easily seen that the term $m^{2/d}$ is needed, but the logarithmic term is probably unnecessary. The term $s^2$ should probably be $s$.

In these Notes we will deal only with the binomial and trinomial case. We will prove (Mueller and Schmidt (1987)):


THEOREM 7A. Suppose $F$ is a binomial or trinomial of degree $d \ge 9$. Then the number of solutions of the Thue inequality (7.2) is
\[ \ll m^{2/d}. \]

If we combine this with Theorem 1C (to deal with $3 \le d \le 8$), we see that for any binomial or trinomial Thue inequality, the number of solutions is
\[ \ll m^{2/d}\left(1 + \log m^{1/d}\right). \]


§8. The Distribution of the Roots of Sparse Polynomials.


By "sparse polynomials", we mean polynomials with few coefficients (as compared to their degree). Given a binary form
\[ F(x,y) = \sum_{i=0}^{s} a_i x^{d_i} y^{d - d_i} \]
as in §7, we have a corresponding polynomial in one variable, $f(z)$, determined by $f(x) = F(x,1)$. That is,
\[ f(z) = \sum_{i=0}^{s} a_i z^{d_i} \tag{8.1} \]
with $0 = d_0 < d_1 < \dots < d_s$. In this section, we will study the distribution of the roots of such a sparse polynomial $f(z)$.

We start with the case $s = 2$, say
\[ f(z) = a_0 + a_1 z^{d_1} + a_2 z^{d_2}, \]
with $0 = d_0 < d_1 < d_2 = d$. First, suppose that $a_1$ is "large", i.e. $|a_1|$ is large relative to $|a_0|$, $|a_2|$. The polynomial $a_0 + a_1 z^{d_1}$ has $d_1$ roots, all of which satisfy $|z| = |a_0/a_1|^{1/d_1}$. Then it turns out that the original polynomial $f(z)$ has $d_1$ roots with $|z| \approx |a_0/a_1|^{1/d_1}$. Similarly, the polynomial $a_1 z^{d_1} + a_2 z^{d_2}$ has $d_2 - d_1$ nonzero roots satisfying $|z| = |a_1/a_2|^{1/(d_2 - d_1)}$, and $f(z)$ has $d_2 - d_1$ roots with $|z| \approx |a_1/a_2|^{1/(d_2 - d_1)}$. On the other hand, suppose that $a_1$ is "small" relative to $a_0$ and $a_2$. Then the polynomial $a_0 + a_2 z^{d_2}$ has $d_2$ roots with $|z| = |a_0/a_2|^{1/d_2}$. In this case, $f(z)$ has all $d_2 = d$ roots with $|z| \approx |a_0/a_2|^{1/d_2}$.
How may one remember this? Consider the three points
\[ P_0 = (0, -\log|a_0|), \qquad P_1 = (d_1, -\log|a_1|), \qquad P_2 = (d_2, -\log|a_2|). \]
In the first case, when $|a_1|$ is large relative to $|a_0|$ and $|a_2|$, we have the following picture, with $P_1$ lying below the line segment $P_0P_2$.

[Figure: the points $P_0$, $P_1$, $P_2$, with $P_1$ below the segment $P_0P_2$.]

As discussed above, we have $d_1$ roots with
\[ \log|z| \approx -\frac{\log|a_1| - \log|a_0|}{d_1} = \text{slope } P_0P_1, \]
and there are $d_2 - d_1$ roots with
\[ \log|z| \approx -\frac{\log|a_2| - \log|a_1|}{d_2 - d_1} = \text{slope } P_1P_2. \]
In the second case, when $|a_1|$ is small relative to $|a_0|$ and $|a_2|$, the point $P_1$ lies on or above the line segment $P_0P_2$, as illustrated below.

[Figure: the points $P_0$, $P_1$, $P_2$, with $P_1$ on or above the segment $P_0P_2$.]

Here all $d = d_2$ roots satisfy
\[ \log|z| \approx -\frac{\log|a_2| - \log|a_0|}{d_2} = \text{slope } P_0P_2. \]
Returning to the more general case, suppose we have a sparse polynomial $f$ given by (8.1) with $0 = d_0 < d_1 < \dots < d_s = d$. We call the points
\[ P_i = (d_i, -\log|a_i|) \qquad (i = 0,\dots,s), \]
the Newton points. The corresponding Newton polygon is defined to be the lower boundary of the convex hull of the Newton points. In other words, the polygon consists of points $(x,y)$ in the convex hull such that $(x, y - \varepsilon)$ is not in the convex hull for $\varepsilon > 0$. For $s = 5$, we may have the following picture.

[Figure: a Newton polygon for $s = 5$, with its vertices labelled $P_{i(0)} = P_0, P_{i(1)}, \dots, P_{i(\ell)} = P_5$.]

So that we may refer to each of the vertices of the Newton polygon, we label them $P_0 = P_{i(0)}, P_{i(1)}, \dots, P_{i(\ell)} = P_s$, as shown above. For $1 \le u \le \ell$, we also let $\sigma(u)$ denote the slope of the line segment $P_{i(u-1)}P_{i(u)}$. Since the Newton polygon is convex, $\sigma(u)$ is an increasing function of $u$.

By studying the case s = 2, we noticed that there is some relationship between 
the roots of the polynomial f(z) and the Newton polygon of f. The following theorem 
illustrates this connection in the general case. 


THEOREM 8A. There exists a map from the set of roots of $f(z)$ to the set of segments of the Newton polygon such that exactly $d_{i(u)} - d_{i(u-1)}$ roots correspond to the segment $P_{i(u-1)}P_{i(u)}$, and these roots have
\[ \bigl|\log|z| - \sigma(u)\bigr| < (2\log 3)\,s. \]

Remark. In $p$-adic analysis, the Newton polygon is well-known. In that case, $\log|z| = \sigma(u)$. See, e.g., Koblitz (1977), Ch. IV. The Theorem holds for all polynomials, sparse or not, but is probably more interesting in the sparse case.
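The Newton polygon and the data entering Theorem 8A are straightforward to compute. The following is a minimal sketch, assuming the polynomial is given as a dictionary mapping each exponent $d_i$ to its non-zero coefficient $a_i$; the function names are illustrative. It returns, for each segment, the number of roots the theorem assigns to it and the slope $\sigma(u)$; it does not compute the roots themselves.

```python
from math import exp, log

def newton_points(coeffs: dict) -> list:
    """Newton points P_i = (d_i, -log|a_i|), sorted by exponent."""
    return sorted((d, -log(abs(a))) for d, a in coeffs.items() if a != 0)

def lower_hull(points: list) -> list:
    """Vertices of the lower boundary of the convex hull (monotone chain)."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (x0, y0), (x1, y1) = hull[-2], hull[-1]
            # cross <= 0 means hull[-1] lies on or above the chord hull[-2] -> p
            cross = (x1 - x0) * (p[1] - y0) - (y1 - y0) * (p[0] - x0)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def segments(coeffs: dict) -> list:
    """For each segment: (d_{i(u)} - d_{i(u-1)}, slope sigma(u))."""
    hull = lower_hull(newton_points(coeffs))
    return [(x1 - x0, (y1 - y0) / (x1 - x0))
            for (x0, y0), (x1, y1) in zip(hull, hull[1:])]

# Trinomial 1 + 1000 z^2 + z^5: the middle coefficient is "large".
for count, sigma in segments({0: 1, 2: 1000, 5: 1}):
    print(count, "roots with |z| near", exp(sigma))
```

For this trinomial the two slopes reproduce the heuristic radii $|a_0/a_1|^{1/d_1}$ and $|a_1/a_2|^{1/(d_2-d_1)}$ of the case $s = 2$; with a small middle coefficient the hull has a single segment.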

Consider the circular ring (or annulus) $R_u$ given by
\[ e^{\sigma(u) - 2\lambda s} < |z| < e^{\sigma(u) + 2\lambda s}, \]
where
\[ \lambda = \log 3. \tag{8.2} \]
Then the theorem says that the $d_{i(u)} - d_{i(u-1)}$ roots are associated with an annulus $R_u$, and in fact, these roots lie in $R_u$. One should be aware, however, that there may be some overlapping of the annuli $R_u$. For this reason, we will call these annuli "large rings" and we will introduce another set of smaller annuli.

For $1 \le u \le \ell$, let $S_u$ denote the "small ring" given by
\[ e^{\sigma(u) - \lambda} < |z| < e^{\sigma(u) + \lambda}. \]
Two of these small annuli, say $S_u$ and $S_{u+1}$, will overlap precisely if
\[ \sigma(u+1) < \sigma(u) + 2\lambda. \]
In this case, the segments to the left and right of $P_{i(u)}$ have similar slopes, so the angle at the vertex $P_{i(u)}$ is not very sharp. We will say that the vertex $P_{i(u)}$ is "blunt". On the other hand, those vertices $P_{i(u)}$ with $\sigma(u+1) \ge \sigma(u) + 2\lambda$ will be referred to as "sharp". By definition, we have that $P_0$ and $P_s$ are sharp vertices. As with vertices in general, we would like a notation which allows us to refer to a particular sharp vertex. Suppose that we have sharp vertices $P_{i(u)}$ for $u = u(0) = 0, u(1), \dots, u(p) = \ell$. Setting $i[k] = i(u(k))$ for $k = 0,\dots,p$, we have the sharp vertices $P_0 = P_{i[0]}, P_{i[1]}, \dots, P_{i[p]} = P_s$.
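The classification into blunt and sharp vertices depends only on the list of slopes. The sketch below assumes the slopes $\sigma(1) < \dots < \sigma(\ell)$ are given as a Python list (index $u - 1$ holds $\sigma(u)$) and returns the indices $u(0) = 0, u(1), \dots, u(p) = \ell$; it also lists, in logarithmic scale, the annuli obtained by merging the overlapping small rings between consecutive sharp vertices. The names and sample slopes are illustrative.

```python
from math import log

LAMBDA = log(3)  # the constant lambda of (8.2)

def sharp_vertices(slopes: list) -> list:
    """Indices u(0) = 0, ..., u(p) = l of the sharp vertices P_{i(u)}."""
    l = len(slopes)
    sharp = [0]                      # P_0 is sharp by definition
    for u in range(1, l):
        # P_{i(u)} lies between segment u (slopes[u-1]) and segment u+1 (slopes[u])
        if slopes[u] >= slopes[u - 1] + 2 * LAMBDA:
            sharp.append(u)
    sharp.append(l)                  # P_s is sharp by definition
    return sharp

def merged_rings(slopes: list) -> list:
    """Intervals (lower, upper) for log|z|, one per pair of consecutive sharp vertices:
    from sigma(u(k-1)+1) - lambda up to sigma(u(k)) + lambda."""
    u = sharp_vertices(slopes)
    return [(slopes[u[k - 1]] - LAMBDA, slopes[u[k] - 1] + LAMBDA)
            for k in range(1, len(u))]

slopes = [0.0, 1.0, 5.0]             # assumed sample slopes
print(sharp_vertices(slopes))
print(merged_rings(slopes))
```

In the sample, the first interior vertex is blunt (the slope jump is below $2\lambda$) and the second is sharp, so the three small rings merge into two disjoint annuli.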

To eliminate the problem of overlap in the rings, we introduce a third set of annuli which we will call "medium rings". For $1 \le k \le p$, let $L_k$ be the annulus given by
\[ e^{\sigma(u(k-1)+1) - \lambda} < |z| < e^{\sigma(u(k)) + \lambda}. \]
Then $L_k$ is the union of the small annuli $S_u$ with
\[ u(k-1) + 1 \le u \le u(k). \tag{8.3} \]
The $u$ in $u(k-1) < u < u(k)$ correspond to blunt vertices $P_{i(u)}$, so that
\[ \sigma(u+1) < \sigma(u) + 2\lambda \qquad (u(k-1) + 1 \le u < u(k)). \]
Thus, subsequent rings $S_u$, $S_{u+1}$ in this range will overlap. Furthermore, by adding up the changes in the slopes of adjacent line segments, we have
\[
\begin{aligned}
\sigma(u(k)) &\le \sigma(u(k-1) + 1) + 2\lambda\,(u(k) - u(k-1) - 1) \\
&\le \sigma(u(k-1) + 1) + 2\lambda(s - 1).
\end{aligned}
\]
Let $L_k$ be any medium annulus and $z \in L_k$. For any $u$ satisfying the corresponding inequality (8.3), we have
\[ e^{\sigma(u) - 2\lambda s} < |z| < e^{\sigma(u) + 2\lambda s}. \]
In other words, the medium ring $L_k$ is contained in the intersection of the large rings $R_u$ with $u$ satisfying (8.3).

Using the medium annuli $L_k$ $(1 \le k \le p)$, we will see that it suffices to prove the following proposition in place of Theorem 8A.
PROPOSITION 8B. Exactly $d_{i[k]} - d_{i[k-1]}$ roots lie in $L_k$ $(1 \le k \le p)$.

Theorem 8A does follow. We simply map the $d_{i[k]} - d_{i[k-1]}$ roots in $L_k$ to the rings $R_u$ with $u$ satisfying (8.3), and we construct this mapping in such a way that $d_{i(u)} - d_{i(u-1)}$ roots correspond to $R_u$. It is easily seen that Proposition 8B follows from the following


LEMMA 8C.

(i) For $0 < k \le p$, precisely $d_{i[k]}$ roots lie in $|z| < e^{\sigma(u(k)) + \lambda}$.
(ii) For $0 \le k < p$, precisely $d - d_{i[k]}$ roots lie in $|z| > e^{\sigma(u(k)+1) - \lambda}$.


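Proof. As before, let
\[ f(z) = \sum_{i=0}^{s} a_i z^{d_i}. \]
Given $\sigma \in \mathbb{R}$, put $f^*(z) = f(e^{\sigma} z)$. Then the Newton points $P_{i(0)},\dots,P_{i(\ell)}$ of $f$ are mapped into the Newton points $P^*_{i(0)},\dots,P^*_{i(\ell)}$ of $f^*$ by the linear map $(x,y) \mapsto (x, y - \sigma x)$. In fact, the sharp vertices of $f$ are mapped into the sharp vertices of $f^*$. This is easily seen by considering $\sigma^*(u)$, i.e. the slope of the segment to the left of $P^*_{i(u)}$ on the Newton polygon of $f^*$. We see that $\sigma^*(u) = \sigma(u) - \sigma$.

Given $k$ corresponding to a sharp vertex $P_{i[k]}$, we put
\[ \sigma = \sigma(u(k)) + \lambda. \tag{8.4} \]
Then $\sigma^*(u(k)) = \sigma(u(k)) - \sigma = -\lambda$ and $\sigma^*(u(k)+1) = \sigma(u(k)+1) - \sigma(u(k)) - \lambda \ge 2\lambda - \lambda = \lambda$. We illustrate these facts in the following picture.

[Figure: the Newton polygon of $f$ near $P_{i[k]}$, with slopes $\sigma(u(k))$ and $\sigma(u(k)+1)$ on either side, and the Newton polygon of $f^*$ near $P^*_{i[k]}$, with slope $= -\lambda$ on the left and slope $\ge \lambda$ on the right.]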
Then $P^*_{i[k]}$ is the "lowest" point on the Newton polygon of $f^*$ for the chosen value of $\sigma$. Now write
\[ f^*(z) = \sum_{i=0}^{s} a_i^* z^{d_i}. \]
First, suppose that $i < i[k]$. Then the slope of the line segment $P_i^* P^*_{i[k]}$ is $\le -\lambda$, since the Newton polygon is convex. Computing this slope, we get
\[ -\frac{\log|a^*_{i[k]}| - \log|a_i^*|}{d_{i[k]} - d_i} \le -\lambda. \]
Thus
\[ |a_i^*| \le |a^*_{i[k]}|\, e^{-\lambda(d_{i[k]} - d_i)}. \]
Since $\lambda = \log 3$, we have
\[ |a_i^*| \le |a^*_{i[k]}|\, 3^{-(d_{i[k]} - d_i)} \quad \text{for } i < i[k]. \]
In a similar way, one gets
\[ |a_i^*| \le |a^*_{i[k]}|\, 3^{-(d_i - d_{i[k]})} \quad \text{for } i > i[k]. \]
What does this tell us when $k = p$? In that case, $i[k] = i[p] = s$ and we have
\[ |a_i^*| \le |a_s^*|\, 3^{-(d - d_i)} \quad \text{for } i < s. \]
This will show that all of the roots of $f^*$ lie in $|z| < 1$. To see this, suppose $|z| \ge 1$. Then
\[
\begin{aligned}
|f^*(z)| &\ge |z|^d \left(|a_s^*| - |a_{s-1}^*| - \cdots - |a_0^*|\right) \\
&\ge |z|^d\, |a_s^*| \left(1 - 3^{-1} - 3^{-2} - \cdots\right) > 0.
\end{aligned}
\]
Since all of the roots of $f^*$ lie in $|z| < 1$, we have that all the roots of $f$ lie in $|z| < e^{\sigma} = e^{\sigma(u(p)) + \lambda}$. Thus Lemma 8C (i) is true for $k = p$.

For $0 < k < p$, consider instead the polynomial
\[ f_0(z) = \sum_{i=0}^{i[k]} a_i z^{d_i} \]
and the corresponding
\[ f_0^*(z) = f_0(e^{\sigma} z) = \sum_{i=0}^{i[k]} a_i^* z^{d_i} \]
with $\sigma$ given by (8.4). Applying our previous results to $f_0$, we know that all of the $d_{i[k]}$ roots of $f_0^*(z)$ lie in $|z| < 1$. We claim that exactly $d_{i[k]}$ roots of $f^*(z)$ are, in fact, in $|z| < 1$. Once this is proven, we will know that exactly $d_{i[k]}$ roots of $f(z)$ lie in $|z| < e^{\sigma} = e^{\sigma(u(k)) + \lambda}$.
Using Rouché's Theorem to prove the claim, it will suffice to show that
\[ |f_0^*(z) - f^*(z)| < |f_0^*(z)| \quad \text{for } |z| = 1. \]
Then the polynomials $f_0^*$ and $f^*$ will have the same number of roots in the disk $|z| < 1$. But for $|z| = 1$, we have
\[
\begin{aligned}
|f_0^*(z)| - |f_0^*(z) - f^*(z)| &\ge |a^*_{i[k]}| - \sum_{i \ne i[k]} |a_i^*| \\
&> |a^*_{i[k]}| \left(1 - 2\left(3^{-1} + 3^{-2} + \cdots\right)\right) \\
&= 0.
\end{aligned}
\]

To prove part (ii), the lower bound for the zeros, put $\hat{f}(w) = w^d f(1/w)$. Then the Newton polygon of $\hat{f}$ is obtained from the Newton polygon of $f$ by a reflection through the line $x = d/2$. Under such a reflection, sharp vertices remain sharp and the bounds follow.
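The arithmetic behind both estimates in the proof is the geometric series with ratio $1/3$, which is where the choice $\lambda = \log 3$ enters. A short worked version, assuming only that the exponents $d_i$ are distinct non-negative integers and using the coefficient bounds $|a_i^*| \le |a^*_{i[k]}|\,3^{-|d_i - d_{i[k]}|}$ from the proof:

```latex
\[
\sum_{j \ge 1} 3^{-j} = \frac{1/3}{1 - 1/3} = \frac{1}{2}.
\]
% Case k = p: only exponents below d occur, so
\[
\sum_{i < s} |a_i^*| \;\le\; |a_s^*| \sum_{i < s} 3^{-(d - d_i)}
\;\le\; |a_s^*| \sum_{j \ge 1} 3^{-j} \;=\; \tfrac{1}{2}\,|a_s^*| .
\]
% Case 0 < k < p: exponents occur on both sides of d_{i[k]}, and the sum is finite, so
\[
\sum_{i \ne i[k]} |a_i^*| \;\le\; |a^*_{i[k]}| \sum_{i \ne i[k]} 3^{-|d_i - d_{i[k]}|}
\;<\; 2 \cdot \tfrac{1}{2}\,|a^*_{i[k]}| \;=\; |a^*_{i[k]}| .
\]
```

The strict inequality in the two-sided case comes from comparing a finite sum with the full infinite series, which is exactly what Rouché's Theorem requires on $|z| = 1$.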


Exercise 8a. Let $f(z)$ be a polynomial with coefficients in a field $E$ and $|\ |$ a non-Archimedean absolute value. Define the Newton polygon as before. Suppose that $f(z)$ has all of its roots in the field $E$. Then it is known (see, e.g., Koblitz (1977)) that one has a mapping from the roots to the segments of the Newton polygon such that roots corresponding to $P_{i(u-1)}P_{i(u)}$ have $\log|z| = \sigma(u)$, where $\sigma(u)$ is the slope of this segment. Now prove this statement for a trinomial.


§9. The Angular Distribution of Roots.


In §8, we studied the radial distribution of the roots of a sparse polynomial $f(z)$. In this section, we will consider the angular distribution for binomials $f(z) = az^d + c$ and trinomials $f(z) = az^d + bz^q + c$. For a binomial, the roots make up a regular $d$-gon, so that the angular distribution is completely regular. In what follows, $A$ will denote a wedge with vertex 0, i.e., a region bounded by two rays emanating from 0. Part or all of the boundary of $A$ may belong to $A$. Write $|A| = \varphi/2\pi$, where $\varphi$ is the angle between the rays. We will consider the whole plane to be a wedge with $|A| = 1$.


With this notation, we have the following result, which also holds (trivially) for $b = 0$, i.e., for binomials.

THEOREM 9A. Let $f(z)$ be a trinomial of degree $d$. If $Z(A)$ denotes the number of roots of $f$ which lie in $A$, then
\[ \bigl|Z(A) - d|A|\bigr| \le 6. \]


Proof. We may suppose that $b$ is real since the roots of $f(z)$ are not affected when we multiply the polynomial by a suitable constant. Put $t = d - q$, so that $d = t + q$, and write the equation as $az^t + cz^{-q} = -b$. We may assume, without loss of generality, that $t \ge q$ so that $t \ge d/2$.

Introducing the notation $e(x) = e^{2\pi i x}$, write $a = |a|e(\alpha)$, $c = |c|e(\gamma)$, and $z = |z|e(\zeta)$. Then
\[ |a|\,|z|^t\, e(t\zeta + \alpha) + |c|\,|z|^{-q}\, e(-q\zeta + \gamma) = -b. \]
The imaginary part of the left-hand side must be zero, so we have
\[ |a|\,|z|^t \sin(2\pi(t\zeta + \alpha)) = |c|\,|z|^{-q} \sin(2\pi(q\zeta - \gamma)). \]
The left-hand side of this equality vanishes for some $\zeta_0$, hence precisely for $\zeta_0 + (m/2t)$, for $m \in \mathbb{Z}$. The right-hand side may also vanish for one of these values. By a change of notation we can suppose that it vanishes at the same value $\zeta_0$. In that case, the right-hand side will vanish at $\zeta_0 + (m'/2q)$ for $m' \in \mathbb{Z}$.

For the time being, we will require the additional hypothesis that $\gcd(t,q) = 1$. Then $m/2t = m'/2q$ is possible only when $m/2t = m'/2q \in \tfrac{1}{2}\mathbb{Z}$. Thus, for the values
\[ \zeta = \zeta_0 + \tfrac{1}{2t},\ \zeta_0 + \tfrac{2}{2t},\ \dots,\ \zeta_0 + \tfrac{t-1}{2t},\ \zeta_0 + \tfrac{t+1}{2t},\ \dots,\ \zeta_0 + \tfrac{2t-1}{2t}, \]
the left-hand side vanishes but the right-hand side does not. If $\arg(z) = 2\pi\zeta$ and $\zeta$ is one of the above values, then we see that $z$ cannot be a root of $f$. The rays given by $\arg(z) = 2\pi\zeta$, where $\zeta$ is one of the above, are called "forbidden rays" because they contain no solutions. Furthermore, these rays are determined independently of $b$.
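The forbidden rays can be observed numerically in a simple special case. The sketch below assumes $a = c = 1$ (so $\alpha = \gamma = 0$ and $\zeta_0 = 0$), $t = 3$, $q = 2$, and a handful of sample real values of $b$ and radii; all names and sample values are illustrative. On a forbidden ray one has $|f(z)| = |z|^q\,|z^t + z^{-q} + b| \ge |\sin(2\pi q\zeta)|$, so the modulus of $f$ stays bounded away from zero whatever the real $b$.

```python
import cmath
from math import gcd, pi

def forbidden_angles(t: int, q: int) -> list:
    """Values zeta = m/(2t), 0 < m < 2t, m != t, for a = c = 1 and gcd(t, q) = 1."""
    if gcd(t, q) != 1:
        raise ValueError("this sketch assumes gcd(t, q) = 1")
    return [m / (2 * t) for m in range(1, 2 * t) if m != t]

def min_modulus_on_rays(t: int, q: int, b_values, radii) -> float:
    """Smallest |z^d + b z^q + 1| over the sampled points of the forbidden rays."""
    d = t + q
    smallest = float("inf")
    for zeta in forbidden_angles(t, q):
        for r in radii:
            z = cmath.rect(r, 2 * pi * zeta)   # z = r * e(zeta)
            for b in b_values:
                smallest = min(smallest, abs(z**d + b * z**q + 1))
    return smallest

t, q = 3, 2
radii = [0.1 * k for k in range(1, 51)]
b_values = [-10, -1, -0.5, 0, 0.5, 1, 10]
print(forbidden_angles(t, q))
print(min_modulus_on_rays(t, q, b_values, radii) > 0.8)
```

For $t = 3$, $q = 2$ the four forbidden rays are at $\zeta = 1/6, 2/6, 4/6, 5/6$, where $|\sin(2\pi q\zeta)| = \sqrt{3}/2$, which explains the threshold used in the final check.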

Now let $a \ne 0$, $c \ne 0$ be fixed and denote the $d$ roots of $f$ as functions of $b$ by $z_1(b),\dots,z_d(b)$. One can arrange this such that the $z_i(b)$ are continuous functions of the real variable $b$. By the continuity of $z_i(b)$, we see that the values of $z_i(b)$ for various $b$ cannot cross any forbidden ray.

[Figure: a forbidden ray and the path of a root $z_1(b)$, which does not cross it.]

When $b = 0$, the roots of $f$ form a regular $d$-gon and
\[ \bigl|Z(A) - d|A|\bigr| \le 2. \]
Now let $A$ be an angular domain bounded by two forbidden rays. We call such an $A$ a "special angular domain". In that case, the continuity of the $z_i(b)$ gives
\[ \bigl|Z(A) - d|A|\bigr| \le 2 \]
for any $b$.

Now let $A$ be an arbitrary angular domain. There exist special angular domains $A_1$, $A_2$ such that
\[ A_1 \subset A \subset A_2 \]
and
\[ |A_2| - |A_1| \le 2/t \le 4/d. \]
(We do allow the possibility that $A_1$ is empty.) For the arbitrary angular domain $A$, we have
\[ Z(A) \le Z(A_2) \le d|A_2| + 2 \le d|A| + d(|A_2| - |A_1|) + 2. \]
Then
\[ Z(A) \le d|A| + 6, \]
since $|A_2| - |A_1| \le 4/d$. The lower bound for $Z(A)$ is proved similarly.

Now we need only to remove the additional hypothesis that $\gcd(t,q) = 1$. In general, we have $\gcd(t,q) = \delta$. Write $d = \delta d_1$, $t = \delta t_1$, $q = \delta q_1$, so that $\gcd(t_1, q_1) = 1$. The roots of $f$ are of the type $z = w^{1/\delta}$ where $w$ is a root of the polynomial $h(w) = aw^{d_1} + bw^{q_1} + c$. To each root $w$ of $h$, there correspond $\delta$ roots of $f$ which form a regular $\delta$-gon. For any angular domain $A$, we have
\[ |A| = \frac{m}{\delta} + \frac{\mu}{\delta}, \]
where $m \in \mathbb{Z}$ and $0 \le \mu < 1$, as illustrated.

[Figure: an angular domain $A$ split into $m$ domains of angle $1/\delta$ and a remaining domain of angle $\mu/\delta$.]

We will count the number of roots $z$ of $f$ in the domain $A$ by considering the roots of $h$. A domain of angle $1/\delta$ in the $z$-plane corresponds to a complete circle in the $w$-plane.

[Figure: a domain of angle $1/\delta$ in the $z$-plane.]

Furthermore, every root $w$ of $h$ will give rise to a root $z$ in each of the $m$ domains of angle $1/\delta$. Thus we get $d_1 m$ roots in the portion of $A$ of angle $m/\delta$.

Now consider the portion of $A$ of angle $\mu/\delta$ in the $z$-plane. This corresponds to an angular domain $B$ in the $w$-plane with $|B| = \mu$.

[Figure: the domain of angle $\mu/\delta$ in the $z$-plane and the corresponding domain $B$ in the $w$-plane.]

If $Z'$ denotes the number of roots of $h$ in the domain $B$, then
\[ |Z' - d_1\mu| \le 6 \]
by our previous work. Combining these results gives
\[
\begin{aligned}
\bigl|Z(A) - d|A|\bigr| &= \left|d_1 m + Z' - d\left(\frac{m}{\delta} + \frac{\mu}{\delta}\right)\right| \\
&= |d_1 m - d_1 m + Z' - d_1\mu| \\
&= |Z' - d_1\mu| \le 6.
\end{aligned}
\]


This result may be generalized to a polynomial with an arbitrary number of terms, say
\[ f(z) = \sum_{i=0}^{s} a_i z^{d_i}, \]
as before. Khovansky (1981) showed in this case that
\[ \bigl|Z(A) - d|A|\bigr| \le k(s) \]


where k depends only on s. Khovansky obtains k(s) of the order of magnitude es, 
This is almost certainly larger than need be. His work was related to an open question 
which we will discuss below. First, we consider a special case as an exercise. 


Exercise 9a. Suppose f(z) is a polynomial with s + 1 terms as above. Show that 
f has not more than s positive, real roots. 
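
One possible route to Exercise 9a (offered as a hint, not necessarily the intended solution) is Descartes' rule of signs: the number of positive real roots is at most the number of sign changes in the sequence of nonzero coefficients, and a sequence of s + 1 nonzero coefficients has at most s sign changes. The sketch below only counts sign changes; the sample polynomial is a hypothetical choice made for illustration.

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient sequence, ignoring zero entries."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Hypothetical example: z^7 - 5 z^3 + 2, coefficients listed from the constant term up.
coeffs = [2, 0, 0, -5, 0, 0, 0, 1]
s = sum(1 for c in coeffs if c != 0) - 1   # the polynomial has s + 1 terms
changes = sign_changes(coeffs)             # upper bound for the number of positive roots
```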

Now suppose we have two polynomials f(z,w), g(z,w), each containing no more 
than s+ 1 monomials. Furthermore, suppose f and g have only finitely many (complex) 
zeros in common. 


Conjecture. The polynomials f and g have not more than $s^2$ common real roots
(z,w) in the first quadrant, i.e. z > 0, w > 0.

Khovansky showed that the number of such roots is not greater than $\ell(s)$, where $\ell(s)$
is some function similar to k(s) above. The reader may find an account of Khovansky's
work in Risler's (1984/85) paper.

Erdős and Turán (1950) gave another result on angular distribution. They considered
polynomials of the form $f(z) = a_d z^d + \cdots + a_1 z + a_0$, where $a_0 \ne 0$, $a_d \ne 0$. In this
case,

$$|Z(A) - d|A|| \le 16\left( d \log \frac{|a_0| + |a_1| + \cdots + |a_d|}{\sqrt{|a_0 a_d|}} \right)^{1/2}.$$

If all the coefficients are close in absolute value, then we have the bound $\ll (d\log d)^{1/2}$.
We now return to trinomials.


THEOREM 9B. Let f(z) be a trinomial of degree d with roots $\alpha_1, \ldots, \alpha_d$. Then
there is a subset of these roots, say $\beta_1, \ldots, \beta_\ell$, where $\ell \le 32$, such that for every real $\zeta$,
we have

$$\min_{1 \le i \le \ell} |\zeta - \beta_i| \le e^{10}\, d \min_{1 \le i \le d} |\zeta - \alpha_i|.$$

Remark. The numbers 32 and $e^{10}$ are larger than need be. A similar result holds
for polynomials with s + 1 terms, where 32 and $e^{10}$ are replaced by constants which
depend on s. See Schmidt (1987).

Proof. In the case of a trinomial, the corresponding Newton polygon has either
one or two segments. Hence the roots of a trinomial lie in one or two annuli of the type

$$|\log|z| - \sigma| \le (2\log 3)(2) = 4\log 3 < 5.$$

Thus

$$e^{\sigma - 5} < |z| < e^{\sigma + 5},$$

which we will write as

$$B_1 < |z| < B_2,$$

where $B_2/B_1 = e^{10}$.
Consider the following picture. 


The number of roots in each of these two angular domains is not greater than
$d|A| + 6 = 8$, since $|A| = 2/d$. So, there are at most 16 roots in the two angular domains
together. Now suppose that $\alpha$ is a root in the annulus which is not in one of these
angular domains. Suppose $\beta$ is any root in the annulus. Let $\zeta \in \mathbb{R}$. If $|\zeta| \ge 2B_2$, then
$|\zeta - \alpha| \ge B_2$ and

$$|\zeta - \beta| \le |\zeta - \alpha| + |\alpha - \beta| \le |\zeta - \alpha| + 2B_2 \le 3|\zeta - \alpha| < e^{10} d\, |\zeta - \alpha|.$$

On the other hand, if $|\zeta| < 2B_2$, then write $\alpha = \rho e(\eta)$, where $|\eta| \le 1/2$. First, say
$|\eta| \le 1/4$. Then $|\eta| \ge 1/d$ since $\alpha$ is not in the angular domains. We have

$$|\zeta - \beta| \le |\zeta| + |\beta| < 3B_2.$$

In this case, we also have

$$|\zeta - \alpha| \ge |\mathrm{Im}\,\alpha| = |\rho \sin(2\pi\eta)| \ge 4\rho|\eta| \ge 4B_1/d.$$

Combining these two estimates gives

$$|\zeta - \beta| \le \frac{3}{4}\, d\, \frac{B_2}{B_1}\, |\zeta - \alpha| \le e^{10} d\, |\zeta - \alpha|.$$
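
The lower bound for $|\mathrm{Im}\,\alpha|$ in this argument rests on the elementary inequality $|\sin(2\pi\eta)| \ge 4|\eta|$ for $|\eta| \le 1/4$ (the sine lies above its chord on a quarter period). The following is a small numerical sanity check on a sample grid, not a proof; the grid size is an arbitrary choice.

```python
import math

def chord_gap(eta):
    """Return |sin(2*pi*eta)| - 4*|eta|; it should be >= 0 when |eta| <= 1/4."""
    return abs(math.sin(2 * math.pi * eta)) - 4 * abs(eta)

# Sample eta on a grid covering [-1/4, 1/4].
samples = [k / 4000 for k in range(-1000, 1001)]
worst = min(chord_gap(eta) for eta in samples)
```

Equality is attained at the endpoints $\eta = 0$ and $|\eta| = 1/4$, so `worst` is expected to be zero up to rounding.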

It is now clear that the theorem holds with $\beta_1, \ldots, \beta_\ell$ the roots in the angular
domains in each of the annuli, if there are such roots. If the angular domains in one
of the annuli contain no root, but if there are roots in this annulus, then we pick one
such root to be among $\beta_1, \ldots, \beta_\ell$. Clearly, the Theorem holds with this choice.


§10. On Trinomials. 


Initially, we will consider trinomials of the form $g(z) = z^d + \mu z^q \pm 1$. We will write
$M = |\mu|$ and $d = q + t$.

LEMMA 10A.
(i) When $M \ge 3^{4d}$, then g has exactly t roots in the annulus

$$M^{1/t}\, 3^{-4} < |z| < M^{1/t}\, 3^{4},$$

and exactly q roots in

$$M^{-1/q}\, 3^{-4} < |z| < M^{-1/q}\, 3^{4}.$$

(ii) When $M < 3^{4d}$, then all d roots of g lie in the annulus

$$3^{-4(d+1)} < |z| < 3^{4(d+1)}.$$

Remark. In case (i), roots z in the first annulus have |z| > 1 and those in the
second annulus have |z| < 1. In what follows, we will call these "large" roots and
"small" roots, respectively.


Proof. Consider the Newton polygon for g(z), illustrated below.

[Figure: Newton polygon of g, with the vertex $(q, -\log M)$ marked.]

When M > 1, the Newton polygon has two segments, and when $M \le 1$ it has only
one.
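
As an illustration of the two cases, here is a minimal sketch that returns the slopes of the Newton polygon of g. It assumes the convention suggested by the figure label, namely that the polygon is the lower convex hull of the points $(0, 0)$, $(q, -\log M)$, $(d, 0)$; the function name and return format are hypothetical.

```python
import math

def newton_slopes(d, q, M):
    """Segments of the lower hull through (0,0), (q,-log M), (d,0).

    Returns a list of (slope, horizontal_length) pairs.
    """
    t = d - q
    if M > 1:
        # (q, -log M) lies strictly below the chord, so it is a vertex:
        # the first segment has length q, the second has length t.
        return [(-math.log(M) / q, q), (math.log(M) / t, t)]
    # Otherwise the chord from (0,0) to (d,0) is the whole polygon.
    return [(0.0, d)]
```

Under this convention the slopes $-(\log M)/q$ and $(\log M)/t$ are exactly the centers of the two ranges for $\log|z|$ that appear in the rest of the proof.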
In case (i), apply Theorem 8A to see that there are t roots corresponding to the
second segment with

$$\left| \log|z| - \frac{\log M}{t} \right| \le 4\log 3.$$

So we have

$$M^{1/t}\, 3^{-4} \le |z| \le M^{1/t}\, 3^{4}.$$

Similarly, there are q roots corresponding to the first segment which satisfy

$$\left| \log|z| + \frac{\log M}{q} \right| \le 4\log 3.$$

Then

$$M^{-1/q}\, 3^{-4} \le |z| \le M^{-1/q}\, 3^{4}.$$

In case (ii), since $M < 3^{4d}$, we may have either one or two segments on the Newton
polygon, as mentioned above. The absolute value of the slope of any segment, however,
is not greater than $\log M < (4\log 3)d$. Using Theorem 8A once again, we know that all
roots have

$$-(d+1)(4\log 3) < \log|z| < (d+1)(4\log 3).$$

That is, all d roots satisfy

$$3^{-4(d+1)} < |z| < 3^{4(d+1)}.$$


We introduce the notation $A \gg B$ to mean that $A \ge B/K^d$, where K is an absolute
constant.

LEMMA 10B.
(i) When $M \ge 3^{4d}$, then every large root z has

$$|g'(z)| \gg M^{(d-1)/t},$$

and every small root z has

$$|g'(z)| \gg M^{1/q}.$$

(ii) When $M < 3^{4d}$, then every root has either

$$|g'(z)| \gg 1 \quad\text{or}\quad |g''(z)| \gg 1.$$


Proof. (i). Since $g(z) = z^d + \mu z^q \pm 1$, we have

$$g'(z) = dz^{d-1} + q\mu z^{q-1} = dz^{d-1} + d\mu z^{q-1} - t\mu z^{q-1} \pm \frac{d}{z} \mp \frac{d}{z}.$$

If z is a root, then

$$dz^{d-1} + d\mu z^{q-1} \pm \frac{d}{z} = \frac{d}{z}\, g(z) = 0,$$

and we have

$$|g'(z)| = \left| -t\mu z^{q-1} \mp \frac{d}{z} \right| \ge \frac{1}{|z|}\left( tM|z|^q - d \right).$$

If z is a large root of g, then $M^{1/t}\, 3^{-4} < |z| < M^{1/t}\, 3^{4}$, so that

$$|g'(z)| \ge \frac{1}{M^{1/t}\, 3^{4}} \left( tM^{1+(q/t)}\, 3^{-4q} - d \right).$$

Furthermore, $tM^{1+(q/t)}\, 3^{-4q} \ge M \ge 2d$, so that we have

$$|g'(z)| \ge \frac{M^{1+(q/t)}}{2 \cdot 3^{4} \cdot 3^{4q} \cdot M^{1/t}} \gg M^{(d-1)/t}.$$

If z is a small root of g, then $\hat z = 1/z$ is a large root of the reciprocal polynomial
$\hat g(z) = \pm z^d + \mu z^t + 1$. In that case, $|\hat z| \le M^{1/q}\, 3^{4}$ and $|\hat g'(\hat z)| \gg M^{(d-1)/q}$. The
original polynomial g(z) and its reciprocal polynomial $\hat g(z)$ are related by the equation
$g(z) = z^d\, \hat g(1/z)$. From the product rule and the fact that $\hat g(\hat z) = 0$ for a root z of g, we
get

$$g'(z) = -z^{d-2}\, \hat g'(\hat z) = -\hat z^{\,2-d}\, \hat g'(\hat z).$$

Then

$$|g'(z)| \gg M^{(2-d)/q}\, M^{(d-1)/q} = M^{1/q}.$$


Now we look at case (ii). From the first part of the proof, we have

$$g'(z) = -t\mu z^{q-1} \mp \frac{d}{z}$$

when z is a root of g. Starting with the original polynomial g and differentiating twice,
we get

$$g''(z) = d(d-1)z^{d-2} + \mu q(q-1)z^{q-2} = d(d-1)\frac{g(z)}{z^2} + \mu\bigl(q(q-1) - d(d-1)\bigr)z^{q-2} \mp \frac{d(d-1)}{z^2}.$$

As previously, the first term vanishes if z is a root. In that case,

$$g''(z) = \mu\bigl(q(q-1) - d(d-1)\bigr)z^{q-2} \mp \frac{d(d-1)}{z^2}.$$

Now consider the expression

$$zg'(z)\bigl(d(d-1) - q(q-1)\bigr) - z^2 g''(z)\,t = \mp d\bigl(d(d-1) - q(q-1) - t(d-1)\bigr) = \mp dqt.$$

This equation implies that either

$$\bigl|zg'(z)\bigl(d(d-1) - q(q-1)\bigr)\bigr| \ge dqt/2$$

or

$$|z^2 g''(z)\,t| \ge dqt/2.$$

In other words, either

$$|zg'(z)| \gg 1 \quad\text{or}\quad |z^2 g''(z)| \gg 1.$$

By Lemma 10A, we have $|z| \ll 1$, so that

$$|g'(z)| \gg 1 \quad\text{or}\quad |g''(z)| \gg 1.$$
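
The key step of case (ii) is a polynomial identity: writing $D = d(d-1) - q(q-1)$ and $\varepsilon$ for the constant term of g ($\varepsilon = 1$ for the polynomial as displayed; the sketch also allows $\varepsilon = -1$), one has $D\,z g'(z) - t\,z^2 g''(z) = dqt\,(g(z) - \varepsilon)$ for every z, which becomes $-\varepsilon\, dqt$ at a root. The following minimal sketch checks this identity in exact rational arithmetic; the function name and the sample parameters are illustrative assumptions.

```python
from fractions import Fraction

def identity_holds(d, q, mu, eps, z):
    """Check D*z*g'(z) - t*z^2*g''(z) == d*q*t*(g(z) - eps) exactly, for z != 0."""
    z = Fraction(z)
    t = d - q
    D = d * (d - 1) - q * (q - 1)
    g = z**d + mu * z**q + eps
    g1 = d * z**(d - 1) + q * mu * z**(q - 1)                    # g'(z)
    g2 = d * (d - 1) * z**(d - 2) + q * (q - 1) * mu * z**(q - 2)  # g''(z)
    return D * z * g1 - t * z * z * g2 == d * q * t * (g - eps)

# Try a few trinomials and a few nonzero rational points.
all_ok = all(
    identity_holds(d, q, mu, eps, z)
    for d in (3, 5, 9)
    for q in range(1, d)
    for mu in (-7, 2)
    for eps in (1, -1)
    for z in (Fraction(1, 3), Fraction(-5, 2), 4)
)
```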


The two preceding lemmas dealt with trinomials of the form $g(z) = z^d + \mu z^q \pm 1$.
In general, we have $f(z) = az^d + bz^q \pm c \in \mathbb{Z}[z]$, where a, c > 0. Put

$$\mu = b\, a^{-q/d} c^{-t/d} \quad\text{and}\quad M = |\mu|.$$

Then $f(z) = c\,g(w)$, where $w = (a/c)^{1/d} z$. Now the various cases will depend on
$H = \max(a, |b|, c)$. We will suppose that a, b, c are integers, so that in particular
$a \ge 1$, $c \ge 1$.
LEMMA 10C. If $M \ge 3^{4d}$ and H = c, then every root of f has

$$|f'(z)| \gg H^{1-(1/d)}.$$

Proof. As we have seen, $f(z) = c\,g((a/c)^{1/d} z)$, so

$$|f'(z)| = c\,(a/c)^{1/d}\, |g'((a/c)^{1/d} z)|.$$

If z is a root of f, then $(a/c)^{1/d} z$ is a root of the special polynomial g, and by Lemma
10B, we know that $|g'((a/c)^{1/d} z)| \gg 1$. Therefore,

$$|f'(z)| \gg c\,(a/c)^{1/d} = a^{1/d} c^{1-(1/d)} \ge H^{1-(1/d)},$$

since $a \ge 1$.
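LEMMA 10D. If $M \ge 3^{4d}$ and H = |b|, then every large root of f has

$$|f'(z)| \gg P^{q} H^{1-(2/d)},$$

where

$$P = \left(\frac{H}{a}\right)^{(1-1/d)/t} \left(\frac{H}{c}\right)^{1/(qd)}.$$

Proof. As before, we have $f(z) = c\,g((a/c)^{1/d} z)$ and

$$|f'(z)| = c\,(a/c)^{1/d}\, |g'((a/c)^{1/d} z)| \gg c\,(a/c)^{1/d} M^{(d-1)/t} = \left( \frac{|b|^{d-1}}{a^{q-1}} \right)^{1/t} = P^{q} a^{1/d} |b|^{1-(2/d)} c^{1/d} \ge P^{q} H^{1-(2/d)}.$$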


LEMMA 10E. Suppose $M < 3^{4d}$ and $a \le c$. Then every root of f has either

$$|f'(z)| \gg H^{1-(1/d)} \quad\text{or}\quad |f''(z)| \gg H^{1-(2/d)}.$$

Proof. Since $M \ll 1$, we have $|b| \ll a^{q/d} c^{t/d} \le c$ and $H \ll c$. From the chain rule,
with $w = (a/c)^{1/d} z$,

$$|f'(z)| = a^{1/d} c^{1-(1/d)}\, |g'(w)|$$

and

$$|f''(z)| = a^{2/d} c^{1-(2/d)}\, |g''(w)|.$$

By Lemma 10B, either

$$|f'(z)| \gg a^{1/d} c^{1-(1/d)} \gg H^{1-(1/d)}$$

or

$$|f''(z)| \gg a^{2/d} c^{1-(2/d)} \gg H^{1-(2/d)}.$$


§11. Roots of f close to x/y.


We now return to the Thue inequality

$$|F(x,y)| \le m, \qquad (11.1)$$

where F is the homogeneous polynomial given by

$$F(X,Y) = aX^d + bX^qY^t \pm cY^d,$$

with a, c > 0. We will see that either there exists a root $\alpha$ of f(z) = F(z,1) which is
close to x/y, or there exists a root $\beta$ of the reciprocal polynomial $\hat f(z) = F(1,z)$ which is
close to y/x. We must distinguish several cases.


LEMMA 11A. Let H = max(a, c). Suppose that $M \ge 3^{4d}$ and (x, y) is a solution
to (11.1) with $x \ne 0$, $y \ne 0$. Then either there is a root $\alpha$ of f(z) = F(z,1) with

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{mH^{(1/d)-1}}{|y|^d},$$

or there is a root $\beta$ of $\hat f(z) = F(1,z)$ with

$$\left| \beta - \frac{y}{x} \right| \ll \frac{mH^{(1/d)-1}}{|x|^d}.$$

Proof. We may suppose that H = c. In this case, we will see that the first
alternative holds. By Lemma 3A, with the parameter u = 1, there is a root $\alpha$ of f with

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{m}{|f'(\alpha)|\,|y|^d}.$$

Furthermore, by Lemma 10C, we have $|f'(\alpha)| \gg H^{1-(1/d)}$, so that

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{mH^{(1/d)-1}}{|y|^d}.$$

The second case follows similarly when H = a.


LEMMA 11B. Suppose $M \ge 3^{4d}$ and H = |b|, and (x, y) is a solution to (11.1)
with $x \ne 0$, $y \ne 0$. Then either there is a large root $\alpha$ of f(z) = F(z,1) with

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{mH^{(2/d)-1}}{|y|^d},$$

or there is a large root $\beta$ of $\hat f(z) = F(1,z)$ with

$$\left| \beta - \frac{y}{x} \right| \ll \frac{mH^{(2/d)-1}}{|x|^d}.$$
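Proof. The polynomial f(z) has t large roots, say $\alpha_1, \ldots, \alpha_t$, and q small roots,
say $1/\beta_1, \ldots, 1/\beta_q$. Then the reciprocal polynomial $\hat f$ has large roots $\beta_1, \ldots, \beta_q$ and
small roots $1/\alpha_1, \ldots, 1/\alpha_t$. Let

$$L = \min(|x - \alpha_1 y|, \ldots, |x - \alpha_t y|)$$

and

$$\hat L = \min(|y - \beta_1 x|, \ldots, |y - \beta_q x|),$$

and consider the two real numbers

$$\left( \frac{a}{|b|} \right)^{(1-1/d)/t} L, \qquad \left( \frac{c}{|b|} \right)^{(1-1/d)/q} \hat L.$$

By symmetry, we may suppose that

$$\left( \frac{a}{|b|} \right)^{(1-1/d)/t} L \le \left( \frac{c}{|b|} \right)^{(1-1/d)/q} \hat L.$$

We will see that the first case follows.
We have

$$\hat L \le |y - \beta_k x| = |\beta_k|\, |x - \beta_k^{-1} y| \qquad (1 \le k \le q),$$

so that

$$L \le \left( \frac{|b|}{a} \right)^{(1-1/d)/t} \left( \frac{c}{|b|} \right)^{(1-1/d)/q} |\beta_k|\, |x - \beta_k^{-1} y| \qquad (1 \le k \le q).$$

Now $\beta_k$ is a large root of $\hat f$, so $\beta_k$ is $(a/c)^{1/d}$ times a large root of $\hat g$. Then we have

$$|\beta_k| \ll (a/c)^{1/d} M^{1/q} = (|b|/c)^{1/q}$$

and

$$L \ll \left( \frac{|b|}{a} \right)^{(1-1/d)/t} \left( \frac{|b|}{c} \right)^{1/(qd)} |x - \beta_k^{-1} y| = P\, |x - \beta_k^{-1} y|,$$

where P is as in Lemma 10D.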
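By reordering, we may suppose that $L = |x - \alpha_1 y|$. Then

$$|(\alpha_1 - \alpha_j) y| \le |x - \alpha_1 y| + |x - \alpha_j y| \le 2\,|x - \alpha_j y| \qquad (2 \le j \le t).$$

From our work above, we also have

$$|(\alpha_1 - \beta_k^{-1}) y| \le |x - \alpha_1 y| + |x - \beta_k^{-1} y| \ll P\,|x - \beta_k^{-1} y| + |x - \beta_k^{-1} y| \ll P\,|x - \beta_k^{-1} y| \qquad (1 \le k \le q),$$

since $P \ge 1$. Writing f as a product of its linear factors, differentiating, evaluating at
$z = \alpha_1$, and multiplying by $|x - \alpha_1 y|\,|y^{d-1}|$ gives

$$|x - \alpha_1 y|\,|f'(\alpha_1) y^{d-1}| = |x - \alpha_1 y|\; a \prod_{2 \le j \le t} |(\alpha_1 - \alpha_j) y| \prod_{1 \le k \le q} |(\alpha_1 - \beta_k^{-1}) y| \ll P^{q}\, a \prod_{1 \le j \le t} |x - \alpha_j y| \prod_{1 \le k \le q} |x - \beta_k^{-1} y|.$$

But $|F(x,y)| = a \prod_{1 \le j \le t} |x - \alpha_j y| \prod_{1 \le k \le q} |x - \beta_k^{-1} y|$, so that by (11.1) we have

$$\left| \alpha_1 - \frac{x}{y} \right| \ll \frac{mP^{q}}{|f'(\alpha_1)|\,|y|^d}.$$

Furthermore, Lemma 10D gives a lower bound for the derivative, namely

$$|f'(\alpha_1)| \gg P^{q} H^{1-(2/d)}.$$

Then

$$\left| \alpha_1 - \frac{x}{y} \right| \ll \frac{mH^{(2/d)-1}}{|y|^d},$$

as desired.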


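LEMMA 11C. Suppose $M < 3^{4d}$, and (x, y) is a solution to (11.1) with $x \ne 0$,
$y \ne 0$. If $a \le c$, then there is a root $\alpha$ of f satisfying either

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{mH^{(1/d)-1}}{|y|^d} \qquad (11.2)$$

or

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{m^{1/2} H^{(1/d)-(1/2)}}{|y|^{d/2}}. \qquad (11.3)$$

If c < a, then there is a root $\beta$ of $\hat f$ with either

$$\left| \beta - \frac{y}{x} \right| \ll \frac{mH^{(1/d)-1}}{|x|^d}$$

or

$$\left| \beta - \frac{y}{x} \right| \ll \frac{m^{1/2} H^{(1/d)-(1/2)}}{|x|^{d/2}}.$$

Proof. By symmetry, we may assume $a \le c$. Once again, apply Lemma 3A, this
time with u = 1 and u = 2. We see that there is a root $\alpha$ of f with

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{m}{|f'(\alpha)|\,|y|^d}$$

and with

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{m^{1/2}}{|f''(\alpha)|^{1/2}\,|y|^{d/2}}.$$

By Lemma 10E, either $|f'(\alpha)| \gg H^{1-(1/d)}$ or $|f''(\alpha)| \gg H^{1-(2/d)}$, which gives the two
cases above, respectively.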

In the next section, we will complete the proof of Theorem 7A, which gives a bound
on the number of solutions of the Thue inequality (11.1). For any solution pair (x, y),
there may be a different $\alpha$ satisfying one of the results in this section. This could
possibly introduce a factor of d. However, Theorem 9B allows us to restrict ourselves to
as few as 32 roots $\alpha$ if we are willing to give up a factor which is $\ll 1$. So it remains for
us to estimate the number of integers (x, y) satisfying one of the lemmas in this section
for a fixed root $\alpha$.

Before doing so, we would like to combine our lemmas to get a single relation.
Write

$$K = mH^{(2/d)-1}. \qquad (11.4)$$

Then (11.3) in Lemma 11C becomes

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{K^{1/2}}{|y|^{d/2}},$$

and all of the other relations for $\alpha$ imply

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{K}{|y|^d}.$$

If we restrict ourselves to solutions where $\min(|x|, |y|) \ge K^{1/d}$, then we see that it will
suffice to study

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{K^{1/2}}{|y|^{d/2}}.$$

By symmetry, a similar result holds for roots $\beta$.

§12. Proof of Theorem 7A.
Recall, in Theorem 7A, we give a bound on the number of solutions of

$$|F(x,y)| \le m,$$

where F(x,y) is an irreducible trinomial of degree $d \ge 9$. As in the previous sections,
we write

$$F(x,y) = ax^d + bx^q y^t \pm cy^d,$$

with a > 0, c > 0. We let $H = \max(a, |b|, c)$ and $K = m/H^{1-(2/d)}$.


THEOREM 12A.
(a) If F is a trinomial with $d \ge 9$ and K < 1, then the number of primitive solutions
of the Thue inequality above is

$$\ll 1 + \frac{\log^+\bigl(\log m/\log(2/K)\bigr)}{\log d} \ll 1 + \frac{\log^+ \log m}{\log d}.$$

(b) If F and d are as above and K > 1, then the number of primitive solutions to the
Thue inequality with

$$\min(|x|, |y|) \ge c_1 K^{1/(d-4)}$$

is

$$\ll 1 + \frac{\log^+ \log m}{\log d}.$$

COROLLARY 12B. If $m \le H^{1-(3/d)}$, then the number of solutions is $\ll 1$.
The corollary follows since $m \le H^{1-(3/d)}$ gives $1/K \ge H^{1/d}$. Then

$$\log m/\log(2/K) \ll (\log H) \Big/ \left( \tfrac{1}{d}\log H \right) = d.$$

Remark. In fact, the corollary is still true when the 3 is replaced by any constant
greater than 2.


Proof (Theorem 12A). Let œ1,... ,aq be the roots of f(z) = F(z,1). By Lemma 
7B of Chapter I, h(a1)...h(aa) S 52/2 H(f) < H. In particular, each root œ has 
h(a) < H. Since d 2 9, we have 


ip _ fas fy 
LE AN eh a Oe 
V2d =e & 
with x = #/9/8. 


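In case (a), in view of what we said at the end of the last section, we need to
consider

$$\left| \alpha - \frac{x}{y} \right| \ll \frac{K^{1/2}}{y^{d/2}}. \qquad (12.1)$$

That is, look at

$$\left| \alpha - \frac{x}{y} \right| < \frac{c_2^{d} K^{1/2}}{y^{d/2}} < \frac{K^{1/2}}{y^{d/(2\kappa)}} < \frac{1}{y^{d/(2\kappa)}}$$

if $y > c_3$. The exponent in the last inequality is $\mu = d/(2\kappa) \ge \kappa\sqrt{2d}$. Thus, by Theorem
9A of Chapter II, the number of solutions with $|y| \ge h(\alpha)$ is $\ll 1$.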
Next consider solutions of (12.1) with $c_3 \le |y| \le h(\alpha)$. Suppose these solutions
have denominators $c_3 \le y_0 \le y_1 \le \ldots \le y_\nu \le h(\alpha)$. By the Gap Principle, we have
gi (4/2x)-1 > _2_ a/s 


and so 
2 (d/3)v—1 
h(a) 2 y 2 (az) . 


This gives a bound on the number of denominators, namely 


logt (log h(a)/ log (2/K'/)) 
log (d/3) l 


Since h(a) < H, (i.e. h(a) £ c4H), we have 


v+1[2+ 


+ 1/2 
y +1 < 24 28 (log H/log (2/K1)) 
logd 


Furthermore, 


log H/ log(2/K*/?) < (log m + log(1/K))/log(2/K*/?) 
< 1+ (log m/log(2/K)) , 
and we see that the number of solutions in this case is 


log* (log m/log(2/K)) 


«1+ eed 


For part (a) of the theorem, it remains to count the solutions with |y| £ c3. This 
will be done after stating Lemma 12C, which follows shortly. 
In part (b), we have K > 1. As before, we consider the inequality 


d 71/2 d 71/2 
z c3 K c K 
la 7 < LAm < paga TS (12.2) 
If y > cg K°?/4, then 
T 1 


Again by Theorem 9A of Chapter II, the number of solutions with y 2 h(a) is < 1. 
Thus it remains to count the solutions with y < max (cgK°7/4, h(a)). 
We rewrite the inequality in (12.2) as 


T 
a-— 


ak? A A 


KE D = 
yal? yil ats” 


(12.3) 
where $\delta = (d/2) - 2$. By a variation of Lemma 8C of Chapter II, we know that the
number of solutions to (12.3) with $W \le y < W^{C}$, where $W \ge (4A)^{1/\delta}$, is


log (C log W) 


1 
OPT) 


In our particular case, we want to count solutions with 
(4A)'/® Sy S max (esK/4, h(a), 
so we let W = (4A)!/® and 


_ log cg Ker/4 log h(a) 


T log(4A)1/6 mn = Jog (4A)1/6 ` 


In the first case, in view of A = c2 K1/?, we have C <1, and in the second, 
C < logh(a) $ log (c2łH) < d+ logm. We also have 


log W = log (4A)'/5 <1+4logm, 
and the bound on the number of solutions becomes 


logt log m 
log d 


In case (b), we are left with solutions with [y| < (4A)!/° $ e7K'/?8 = eg KM(4-4) | 
Thus by the way Theorem 12A (b) was formulated, we are finished with this case. 
In case (a), it remains to count solutions with |y| £ c3. We will finish this after the 
following lemma. 


«1+ 


LEMMA 12C. Let p(X) = AX? + BX! +C be a polynomial. The real numbers 
€ with 
IRCE) £ m 


fall into the union of at most eight intervals of total length 


Eir ( m i ear ( an a ( E p 
TAI 1 | Tp] ťa TAI z 
|A| |B| lal 


The proof will not be given here, but the following picture may give the reader
some indication as to what goes on.

[Figure: graph of $p(\xi)$.]

For a proof (using only calculus) see Mueller and Schmidt (1987, Lemma 7.1).
In particular, if |A| 2 1, the total length is 


m \2/4 mil? mi/2 
<1+min (5) > [BIG?)-G7a)" IC |Q/2)-G/4) j ` 


Now we are able to complete the proof of part (a) of Theorem 12A. Recall that we
left open the number of solutions of $|F(x,y)| \le m$ where $|y| \le c_3$ (or when $|x| \le c_3$).
For given y, consider the polynomial

$$p(X) = F(X,y) = aX^d + by^t X^q \pm cy^d = AX^d + BX^q + C,$$

with $A = a$, $B = by^t$, $C = \pm cy^d$. We have K < 1, so that $m < H^{1-(2/d)}$, and then
$m^{1/2} < H^{(1/2)-(1/d)}$. For each $y \ne 0$, the lemma tells us that the number of solutions
is $\ll 1$. When y = 0, the only possible primitive points are (1, 0) and (−1, 0). All
together, we have $\ll 1$ solutions in this case.

We may now prove the main theorem, which we state again for the reader. 


THEOREM 7A. If F(X,Y) is an irreducible trinomial of degree $d \ge 9$, then the
number of solutions of

$$|F(x,y)| \le m$$

is

$$\ll m^{2/d}.$$
Proof. As in section 2, it suffices to show that

$$P(m) \ll m^{2/d},$$

where P(m) denotes the number of primitive solutions. By Theorem 12A, we have


logt logm 


oi) Sama er aa 


provided that K < 1 or $\min(|x|, |y|) \ge c_1 K^{1/(d-4)}$, where $K = m/H^{1-(2/d)}$. So we
are left with the case where K > 1 and $\min(|x|, |y|) < c_1 K^{1/(d-4)}$.

Say $1 \le |y| \le c_1 K^{1/(d-4)}$. It suffices to consider $1 \le |y| \le c_1 m^{1/(d-4)} \le c_1 m^{2/d}$
if $d \ge 8$. Given such a y, the number of solutions in x is (by Lemma 12C with $A = a$,
$B = by^t$, $C = \pm cy^d$)
m \ 14 mi/2 
<xemin( (iq) > pen 


1/2 
< : 1ja _™ 
S1+min(m 3 ast) 


So for given y, the number of solutions is $\ll m^{1/d}$, and there are $\ll m^{2/d}$ solutions
with $1 \le |y| \le m^{1/d}$. Now it remains to count those solutions with

$$m^{1/d} < |y| \le c_1 m^{2/d}.$$


Their number is 


2/d a 12.4 
<m”! + a mep (12.4) 


mifé S |y| $ erm?/4 
We estimate the sum by considering the integral 


eym?/4 1 , 7 
1 
i: aya dy < mi-i) = mł. 
y 


mild 


It is now easily seen that the sum in (12.4) is $\ll m^{2/d}$, and the proof is complete.


§13. Generalizations of the Thue Equation.


Let K be an algebraic number field of degree $[K : \mathbb{Q}] = \delta$. As usual, let M(K)
denote the set of absolute values on K and let S be a finite subset of M(K), containing 
all of the Archimedean absolute values. Let s = card (S) and for a € K, let jalsg = 
Ives jag” s 

An element $\alpha \in K$ having $|\alpha|_v \le 1$ for every $v \notin S$ is called an S-integer. These
integers form a subring $O_S$ of K. The units of $O_S$ form a group $U_S$ consisting of $\alpha \in K$
with $|\alpha|_v = 1$ for every $v \notin S$.
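
For orientation, in the simplest case $K = \mathbb{Q}$ with S consisting of the ordinary absolute value and the p-adic absolute values of finitely many primes, the S-integers are the rationals whose denominators involve only those primes, and the S-units are the rationals $\pm p_1^{z_1} \cdots p_\nu^{z_\nu}$. A minimal sketch of these two membership tests follows; the function names are hypothetical and only this rational case is covered.

```python
from fractions import Fraction

def strip_primes(n, primes):
    """Divide the positive integer n by the given primes as often as possible."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n

def is_S_integer(x, primes):
    """True if the denominator of x is composed only of primes in S."""
    return strip_primes(Fraction(x).denominator, primes) == 1

def is_S_unit(x, primes):
    """True if x is nonzero and numerator and denominator are composed of primes in S."""
    x = Fraction(x)
    if x == 0:
        return False
    return (strip_primes(abs(x.numerator), primes) == 1
            and strip_primes(x.denominator, primes) == 1)
```

For example, with primes 2 and 3 the number 5/12 is an S-integer but not an S-unit, while −9/8 is an S-unit.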

Consider the special case where $S = M_\infty(K)$, i.e. the set of Archimedean absolute
values of K. If the number field K has $r_1$ real embeddings and $r_2$ pairs of complex embeddings,
then $s = r_1 + r_2$. In this case $O_S = O$, the ring of integers in K, and $U_S = U$, the
group of units in K. By a generalized version of Dirichlet's Theorem, we know that $U_S$
has rank s − 1, not just in the special case, but also for general S as above. The torsion
part of $U_S$ is finite and consists of the roots of unity in K. (The reader may find a proof
of this version of Dirichlet's Unit Theorem in Borevich and Shafarevich (1966)).

Let $F(X,Y) \in O_S[X,Y]$ be a homogeneous form of degree $d \ge 3$. Suppose that F
has no multiple factors. Consider the generalized Thue equation

$$|F(x,y)|_S = 1$$

in variables $x, y \in O_S$. Two pairs (x, y) and (x', y') are called equivalent if $x = x'\eta$,
$y = y'\eta$ with $\eta \in U_S$. If (x, y) is a solution to the equation, then every pair which
is equivalent to (x, y) is also a solution, since $|\eta|_S = 1$.


THEOREM 13A. With conventions as above, the number of equivalence classes 
of solutions to 


|F(z, yIs =1 


í (4s)* (4d)?6* : 
This result, which will not be proved here, is due to Bombieri (1985). 


COROLLARY 13B. The number of equivalence classes of solutions to 
F(z, y) E€ Us 


£ (4s)? (4d)?6* i 


Now consider solutions to F(x,y) = 1 with $x, y \in O_S$. If (x, y) and (x', y') are
solutions in the same equivalence class, then they differ by a factor $\eta \in U_S$. Since (by
F(x,y) = F(x',y') = 1), $\eta^d = 1$, we have the following corollary.


COROLLARY 13C. The number of solutions of 
F(z,y)=1 


with z,y € Os is 
Í d(4s)?* (4d)? . 


Now consider the equation

$$F(x,y) = p_1^{z_1} \cdots p_\nu^{z_\nu},$$

where $F(X,Y) \in \mathbb{Z}[X,Y]$ and $p_1, \ldots, p_\nu$ are distinct primes. We seek solutions x, y,
$z_1, \ldots, z_\nu \in \mathbb{Z}$ with

$$\gcd(x, y, p_1 \cdots p_\nu) = 1.$$

This is known as the Thue-Mahler equation, since it is a generalization of the Thue
equation which was first studied by Mahler (1933).
Example. We may seek integer solutions (z, y, z) of z?-2y? = 5* where gcd(z,y,5) =
1. 

For the Thue-Mahler equation above, take $S = \{\infty, p_1, \ldots, p_\nu\}$ with $s = \nu + 1$. As
in Corollary 13B, look for solutions to $F(x,y) \in U_S$ with $x, y \in O_S$. The corollary gives a
bound on the number of equivalence classes of solutions. Two equivalent solutions differ
by a factor in $U_S$, which consists of $\pm p_1^{z_1} \cdots p_\nu^{z_\nu}$. Because of the gcd condition above,
these factors can only be $\pm 1$, and we have the following result.


COROLLARY 13D. The number of solutions of the Thue-Mahler equation 


F(x,y) = pj" -p7 
is 
< 2 (4v +4)? (4de) f 

This corollary includes the case F(z,y) = m, where m = pj’... p” is given. In 
this particular case, we know that, e.g., the factor 26 in the exponent is unnecessary. 
For lower bounds see Erdös, Stewart and Tijdeman (1988). 

Next, we consider the Thue-Mahler Equation in terms of ideals in $O_S$. We know
that the non-Archimedean absolute values in S come from prime ideals, say $\mathfrak{P}_1, \ldots, \mathfrak{P}_t$.
Then the group of units $U_S$ consists of $\alpha \in K^\times$ which generate a principal ideal $(\alpha)$ of
the form $(\alpha) = \mathfrak{P}_1^{z_1} \cdots \mathfrak{P}_t^{z_t}$. Above, we considered the equation $F(x,y) \in U_S$, but this
can also be written as the generalized Thue-Mahler equation

$$(F(x,y)) = \mathfrak{P}_1^{z_1} \cdots \mathfrak{P}_t^{z_t}$$

in unknowns x, y in $O_S$ and $z_1, \ldots, z_t$ in $\mathbb{Z}$.
Now consider the special case

$$(x^d - \omega y^d) = \mathfrak{P}_1^{z_1} \cdots \mathfrak{P}_t^{z_t},$$

where $\omega$ is a given coefficient. Evertse (1984a) studied a variation of this, namely

$$\frac{(x^d - \omega y^d)}{(x^d, \omega y^d)} = \mathfrak{P}_1^{z_1} \cdots \mathfrak{P}_t^{z_t}, \qquad (13.1)$$

where $(\alpha, \beta)$ is the fractional ideal generated by $\alpha$ and $\beta$. In this setting, one studies
solutions $(x,y) \in \mathbb{P}^1(K)$ and $z_1, \ldots, z_t \in \mathbb{Z}$. Evertse obtained the following result.


THEOREM 13E. The number of solutions of (13.1) is no greater than $(c_1 d)^s$,
where $c_1$ is an absolute constant.
(Here s = card S equals t plus the number of Archimedean absolute values).


IV. S-unit Equations and Hyperelliptic Equations. 


References: Evertse, Györy, Stewart and Tijdeman (1988), Shorey and Tijdeman 
(1986), Schlickewei (1977). 


§1. S-unit Equations. 


As in the end of Chapter III, let K be a number field with $[K : \mathbb{Q}] = \delta$. Also,
$M_\infty(K) \subset S \subset M(K)$ and card S = s. An S-unit equation is one of the form
$\alpha_1 x + \alpha_2 y = 1$, where non-zero $\alpha_1, \alpha_2 \in K$ are given, with unknowns $x, y \in U_S$.

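Consider the concrete example where $K = \mathbb{Q}$ and $S = \{\infty, p_1, \ldots, p_\nu\}$, with the
equation

$$x + y = 1.$$

So we are looking at

$$\pm p_1^{z_1} \cdots p_\nu^{z_\nu} \pm p_1^{w_1} \cdots p_\nu^{w_\nu} = 1,$$

where $z_i, w_i \in \mathbb{Z}$ $(i = 1, \ldots, \nu)$. Write x = x'/z' and y = y'/z' with gcd(x', y', z') =
1. Then we have

$$x' + y' = z',$$

or

$$\pm p_1^{z_1} \cdots p_\nu^{z_\nu} \pm p_1^{w_1} \cdots p_\nu^{w_\nu} \pm p_1^{t_1} \cdots p_\nu^{t_\nu} = 0,$$

where $z_i, w_i, t_i$ $(i = 1, \ldots, \nu)$ are nonnegative integers and the summands have no
common prime factor. (Thus $\min(z_i, w_i, t_i) = 0$ $(i = 1, \ldots, \nu)$).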


Example: + 27137? + 243? 4+ 243% =0. 
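
To make the rational case concrete, the following sketch searches for solutions of x + y = 1 in S-units for the hypothetical choice $S = \{\infty, 2, 3\}$. It lets x run over $\pm 2^a 3^b$ in a small box of exponents (the box is an arbitrary assumption of the sketch) and keeps those x for which 1 − x is again of this shape.

```python
from fractions import Fraction

PRIMES = (2, 3)   # finite places of the illustrative set S = {infinity, 2, 3}

def is_S_unit(x):
    """True if x is a nonzero rational of the form +-2^a 3^b."""
    if x == 0:
        return False
    n, m = abs(x.numerator), x.denominator
    for p in PRIMES:
        while n % p == 0:
            n //= p
        while m % p == 0:
            m //= p
    return n == 1 and m == 1

solutions = set()
for a in range(-4, 5):
    for b in range(-3, 4):
        for sign in (1, -1):
            x = sign * Fraction(2) ** a * Fraction(3) ** b
            y = 1 - x
            if is_S_unit(y):
                solutions.add((x, y))
```

Inside this box the search finds 21 pairs, all arising from the identities 1 + 1 = 2, 1 + 2 = 3, 1 + 3 = 4 and 1 + 8 = 9 by rearranging and dividing.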


THEOREM 1A. An S-unit equation $\alpha_1 x + \alpha_2 y = 1$ has at most a finite number
of solutions. Furthermore, if $\alpha_1, \alpha_2 \in U_S$, then their number is


< 3255 ; 


where c is an absolute constant. 


Remark. We may force $\alpha_1, \alpha_2 \in U_S$ by enlarging S if necessary. This may change
the bound, though. In section 2, we will give bounds which do not require $\alpha_1, \alpha_2 \in U_S$.
We follow Evertse, Györy, Stewart and Tijdeman (1988).


Proof. We prove the second assertion, since it implies the first. By a generalized
version of Dirichlet's Unit Theorem mentioned in Chapter III, the group of units $U_S$
has the form

$$U_S \cong \underbrace{\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}}_{s-1 \text{ times}} \oplus\, (\text{Torsion}),$$

where the torsion part is a finite cyclic group consisting of roots of unity.
Let $U_S^d$ be the subgroup of $U_S$ consisting of elements of the form $u^d$ where $u \in U_S$.
The quotient group $U_S/U_S^d$ has cardinality $\le d^{s-1} \cdot d = d^s$. Let $\zeta_1, \ldots, \zeta_\ell$ be coset
representatives, where $\ell \le d^s$. Then any $x, y \in U_S$ may be written as

$$x = \zeta_i X^d \quad\text{and}\quad y = \zeta_j Y^d,$$

where $X, Y \in U_S$ and $1 \le i, j \le \ell$. The S-unit equation $\alpha_1 x + \alpha_2 y = 1$ becomes

$$\alpha_1 \zeta_i X^d + \alpha_2 \zeta_j Y^d = 1.$$
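
The coset count can be checked by hand in the rational case. The sketch below assumes $K = \mathbb{Q}$ and the hypothetical set $S = \{\infty, 2, 3\}$ (so s = 3), where every S-unit is $\pm 2^a 3^b$; its class modulo d-th powers is determined by a and b modulo d, together with the sign when d is even. The function names and the search box are illustrative.

```python
from itertools import product

def coset(sign, a, b, d):
    """Class of sign * 2^a * 3^b in U_S / U_S^d for S = {infinity, 2, 3}."""
    # For odd d, -1 = (-1)^d is itself a d-th power, so the sign is irrelevant.
    sign_class = 0 if (d % 2 == 1 or sign > 0) else 1
    return (sign_class, a % d, b % d)

def count_cosets(d, box=6):
    """Number of distinct classes met by units with exponents in [-box, box]."""
    exps = range(-box, box + 1)
    return len({coset(sg, a, b, d) for sg, a, b in product((1, -1), exps, exps)})
```

Here `count_cosets(3)` gives 9 and `count_cosets(4)` gives 32, both within the bound $d^s$ (27 and 64 respectively).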


Applying the argument for Corollary 13C of Chapter III with any given $d \ge 3$, but
noting that factors $\eta$ with $\eta^d = 1$ do not affect $x = \zeta_i (X\eta)^d$ and $y = \zeta_j (Y\eta)^d$, we see that
the total number of solutions is


í g (4s)? (4d)*6s < d?*(4s)** (4d)*°*, 
For instance, if d = 3, we get the bound 
328 (12)?6 (4s) < sia) 428 < s28 12309 | 


So we have used the fact that a Thue equation has only finitely many solutions, to 
show that an S-unit equation has only finitely many solutions as well. One can see that 
the converse is also true, i.e. that the finiteness of the number of solutions of S-unit 
equations implies the finiteness for Thue equations. We consider the Thue equation 


F(z,y)=m, 


where $F(X,Y) \in O_S[X,Y]$ is homogeneous with at least three non-proportional linear
factors and $m \in O_S$.
In some extension, F factors as

$$F = L_1 \cdots L_d,$$

where the $L_i$ are linear with coefficients in a field $N \supset K \supset \mathbb{Q}$. Let $S' \subset M(N)$
consist of the extensions of the absolute values of S to N. We can assume that m and
the coefficients of $L_i$ lie in $O_{S'}$. Consider the equation

$$L_1(\mathbf{x}) \cdots L_d(\mathbf{x}) = m \qquad (1.1)$$

and seek solutions $\mathbf{x} = (x, y)$ with $x, y \in O_{S'}$.
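Take linear forms $L_1, L_2, L_3$ which are non-proportional. Since these are three
linear forms in two variables, they must be linearly dependent. We have

$$L_3 = \lambda_1 L_1 + \lambda_2 L_2 \quad\text{with } \lambda_1, \lambda_2 \in N^\times,$$

and

$$\lambda_1 \frac{L_1(\mathbf{x})}{L_3(\mathbf{x})} + \lambda_2 \frac{L_2(\mathbf{x})}{L_3(\mathbf{x})} = 1.$$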
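
The dependence relation can be computed explicitly by Cramer's rule. The sketch below works over the rationals only (the text allows coefficients in the field N), represents a linear form $ax + by$ by the pair (a, b), and uses hypothetical sample forms.

```python
from fractions import Fraction

def dependence(L1, L2, L3):
    """Return (lam1, lam2) with L3 = lam1*L1 + lam2*L2, for forms given as (a, b)."""
    (a1, b1), (a2, b2), (a3, b3) = L1, L2, L3
    det = a1 * b2 - a2 * b1            # nonzero because L1, L2 are non-proportional
    lam1 = Fraction(a3 * b2 - a2 * b3, det)
    lam2 = Fraction(a1 * b3 - a3 * b1, det)
    return lam1, lam2

def evaluate(L, point):
    """Value of the linear form L = (a, b) at point = (x, y)."""
    return L[0] * point[0] + L[1] * point[1]

# Hypothetical forms x - y, x - 2y, x - 3y.
L1, L2, L3 = (1, -1), (1, -2), (1, -3)
lam1, lam2 = dependence(L1, L2, L3)
point = (5, 1)
total = (lam1 * Fraction(evaluate(L1, point), evaluate(L3, point))
         + lam2 * Fraction(evaluate(L2, point), evaluate(L3, point)))
```

Since the three forms are pairwise non-proportional, both coefficients come out nonzero, and `total` equals 1 at any point where $L_3$ does not vanish.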
By (1.1) above, we know that $L_i(\mathbf{x}) \mid m$ in $O_{S'}$. Up to equivalence, there are only
finitely many divisors of m. Say $L_i(\mathbf{x}) = \rho u$, where $u \in U_{S'}$, and there are only finitely
many possibilities for $\rho$. Then

$$\frac{L_1(\mathbf{x})}{L_3(\mathbf{x})} = \rho_1 u_1,$$

where $u_1 \in U_{S'}$ and there are only finitely many choices for $\rho_1$. Similarly,

$$\frac{L_2(\mathbf{x})}{L_3(\mathbf{x})} = \rho_2 u_2.$$

This gives

$$\lambda_1 \rho_1 u_1 + \lambda_2 \rho_2 u_2 = 1$$

with $u_1, u_2 \in U_{S'}$. This is an S'-unit equation in $u_1, u_2$, and it has only finitely
many solutions. Thus there are only finitely many possibilities for $L_1(\mathbf{x})/L_3(\mathbf{x})$ and
$L_2(\mathbf{x})/L_3(\mathbf{x})$. But these quotients determine $\mathbf{x}$ up to a factor of proportionality. It now
follows immediately that (1.1) has only finitely many solutions $\mathbf{x}$.


§2. Evertse’s Bound. 


Let K be a number field with $[K : \mathbb{Q}] = \delta$ and $S \subset M(K)$ with card S = s, as
before. Evertse (1983) gives a bound for the number of solutions of the S-unit equation

$$\alpha_1 x + \alpha_2 y = 1$$

which is independent of the coefficients $\alpha_1, \alpha_2$ and which does not require that
$\alpha_1, \alpha_2 \in U_S$.

THEOREM 2A. (Evertse (1984)) The equation

$$\alpha_1 x + \alpha_2 y = 1,$$

where $\alpha_1, \alpha_2 \in K$ are non-zero, has at most

$$3 \times 7^{\delta + 2s}$$

solutions x, y in $U_S$.
Here, we will prove less, namely that the number of solutions is

$$\le c_1(\delta)\, c_2^{\,s}.$$

For instance, if $K = \mathbb{Q}$, then the number is $\le c_3^{\,s}$.
First we reformulate the problem in projective space. That is, we look at solutions
to the equation

$$\alpha_1 x + \alpha_2 y - z = 0,$$

where $(x,y,z) \in \mathbb{P}^2(U_S)$. Or let $\alpha_3 = -1$ and consider

$$\alpha_1 x + \alpha_2 y + \alpha_3 z = 0,$$

which may be rewritten as

$$y_1 + y_2 + y_3 = 0.$$

Then we consider solutions where $y_i/\alpha_i \in U_S$ (i = 1, 2, 3), and proportional solutions
are considered the same.
Let a = (a1,02,a3) and y = (y1,y2,y3). As in Chapter I, let (€), = |€|fv, and 
define = ; 
ees Il (a)y i 
ag (010203)» 


Since |y;/ai]) = 1 for v ¢ S, we have 
(yzy) m 1 1 (as (yr yeys)» 
Uy: Ug i I (a ) 


ves Z/y ves wgs (Y1Y2Y3}v vgs > (a10203)» 


=A Hi(y)* : 


Given y and v, let (iv, jy, ky) be the permutation of (1,2, 3) with the property that 
lyst» S ly le £ lye.le. If v is non-Archimedean, then |y;, |]. = lye, |» since yi, + yj, + 


Yk, = 0. If v is Archimedean, then |yr, |» $ 2|y;,|o and lylo £ V6ly;, |v. This gives 


lyinlo < . \yry2ysle 
a s,s 
lylv \ylè 


’ 


where 
| 1 if v is non-Archimedean, 
Cy = 


V6 ifv is Archimedean. 


Then we have 


I] Fah < 3° A Ay(y)*. (2.1) 
vES Zey E 


This ties in with Roth’s Theorem in its generalized form, that is, Theorem 10B of 
Chapter II. We consider the variables as y1, y2, and we have the linear forms 


yı if ty =1 
Ly=¢ y if n= 
— Yi ~— y2 if 7,=3 
We know that 


grr) 


max (lyilo, [y2ly, Iyı + yale) S max (|yilv, [yale); 
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where 


1 if v is Archimedean . 


0 if v is non-Archimedean 
A(v) = 


Letting z = (yı, yz), the inequality becomes 
(e)s 2 2-0 yy 


Then (2.1) yields 
L,(x))y 
Il alae < 6° A Hx(z)° (2.2) 
LL yw. 
since (Ly), 2 1. 
To count solutions, we apply Theorem 10B of Chapter II. We get 


= c3(6, t, £) ca(e)* 


solutions with 
Hx(z) > e1(5,t,€) (C + H +1209 , 


where t is the number of distinct linear forms L,, so that t = 3, and e = 1, C = 129A. 
So we have 
£ c3(6)c§ 


solutions with 
Hx(z) > e1(8) (129A +241)? | 


Now we are left to count small solutions. We have 
Hx(z) £ Hx(y) $ 2°Hx(z) , 


and there are two possibilities. If A $ 12!°, then Hx(y) S cs(6) . There are only 


finitely many solutions in this case, say S cg(6). Then we are left with the case where 
A > 121°, and we have to count solutions with Hx(y) < A‘), To do this, we need 
the following lemma. = 


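LEMMA 2B. Suppose $0 < \gamma < 1$ and $s \in \mathbb{Z}^+$ are given. Then there is a
finite set $\mathfrak{G}$ of s-tuples $(\Gamma_1, \ldots, \Gamma_s)$ of non-negative reals with $\Gamma_1 + \cdots + \Gamma_s = \gamma$,
such that for every $x = (x_1, \ldots, x_s)$ with $x_i \ge 0$, there is an s-tuple from $\mathfrak{G}$ with
$x_i \ge \Gamma_i (x_1 + \cdots + x_s)$ $(i = 1, \ldots, s)$. Furthermore, card $\mathfrak{G} \le c_8(\gamma)^s$.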


Exercise 2a. Prove Lemma 2B. 
In the case s = 2, the elements of $\mathfrak{G}$ lie on the line $\Gamma_1 + \Gamma_2 = \gamma$. We have the
following picture.

[Figure: the line $\Gamma_1 + \Gamma_2 = \gamma$ with a point $(\Gamma_1, \Gamma_2)$ marked.]

It is easy to see that one may cover the line $x_1 + x_2 = 1$ with a finite number of
sets $\sigma = \sigma(\Gamma_1, \Gamma_2)$ consisting of $(x_1, x_2)$ with $x_i \ge \Gamma_i$ (i = 1, 2). The lemma then
follows for the case s = 2.
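
For general s, one possible construction (a sketch of one approach to Exercise 2a, not necessarily the intended one) rounds the normalized coordinates down to a grid of mesh 1/N with $N = \lceil s/(1-\gamma) \rceil$ and then rescales so that the entries sum to $\gamma$. Only finitely many tuples can arise, since they are determined by non-negative integers $k_1, \ldots, k_s$ with sum at most N. The function name and the grid choice are assumptions of this sketch.

```python
from fractions import Fraction
from math import ceil, floor

def gamma_tuple(x, gamma):
    """Return (G_1, ..., G_s) with sum gamma and x_i >= G_i * (x_1 + ... + x_s)."""
    s = len(x)
    total = sum(x)
    N = ceil(s / (1 - gamma))
    if total == 0:
        k = [1] + [0] * (s - 1)      # any fixed tuple works when all x_i vanish
    else:
        # Round N * x_i / total down; then k_i / N <= x_i / total.
        k = [floor(N * xi / total) for xi in x]
    K = sum(k)                       # K >= N - s + 1 >= gamma * N, in particular K > 0
    return tuple(gamma * ki / K for ki in k)

# Example with gamma = 5/6, the value used later in the text.
x = [Fraction(1, 7), Fraction(0), Fraction(5, 3)]
G = gamma_tuple(x, Fraction(5, 6))
```

The rescaling is harmless because $\gamma/K \le 1/N$, so each entry satisfies $\Gamma_i \le k_i/N \le x_i/(x_1 + \cdots + x_s)$.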


Remark. As a consequence of this lemma, we have that for $z_i \le 0$ $(i = 1, \ldots, s)$,
there is some $(\Gamma_1, \ldots, \Gamma_s) \in \mathfrak{G}$ such that $z_i \le \Gamma_i (z_1 + \cdots + z_s)$ $(i = 1, \ldots, s)$.

To finish counting solutions, we will apply the lemma with y = 5/6 and s = card S. 
Then for every y = (y1, y2, y3), there will be some (T1,... , P's) € G with 


(yin )o (Yin)o Py 
(y)v i (1 (y)v ) a 


vES 


for every v € S. (Notice that we have changed the subscripts of F; (i = 1,...,s) to 
Te (v € S).) Using the estimates (2.2), (2.3), we have 


(Yin dv cays 
7A < (6°A Hx(y) ) (ves). (2.4) 


We subdivide the set of solutions into those with fixed $\{\Gamma_v\}_{v \in S}$, which gives $\le c_8(\gamma)^s = c_9^{\,s}$
classes. We then further subdivide into classes for which $i_v$ is fixed for every $v \in S$,
which gives $3^s$ subclasses. All together, we have $c_{10}^{\,s}$ classes. In the next lemma, we
establish a Gap Principle which allows us to count the number of solutions in a fixed
class.

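LEMMA 2C. Consider solutions $\mathbf{y}$, $\mathbf{y}'$ to (2.4) which lie in a given class. Suppose
$H_K(\mathbf{y}) \le H_K(\mathbf{y}')$. Then

$$H_K(\mathbf{y}') \ge 12^{-s} A^{1/6} H_K(\mathbf{y})^{3/2}.$$

Proof. Write $\mathbf{y} = (y_1,y_2,y_3)$ and $\mathbf{y}' = (y_1',y_2',y_3')$. For $v \in M(K)$, form the
"determinant"

$$\Delta_v = \frac{(y_i y_j' - y_j y_i')_v}{(\mathbf{y})_v (\mathbf{y}')_v},$$

where $i \ne j$. (Since $\sum y_i = \sum y_i' = 0$, it doesn't matter which pair $i \ne j$ we choose.)
In particular, for $v \in S$, take $i = i_v$. Then

$$\Delta_v \le 2^{\chi(v) n_v} \max\Bigl( \frac{(y_{i_v})_v}{(\mathbf{y})_v}, \frac{(y_{i_v}')_v}{(\mathbf{y}')_v} \Bigr),$$

where

$$\chi(v) = \begin{cases} 1 & \text{if } v \text{ is Archimedean}, \\ 0 & \text{if } v \text{ is non-Archimedean}. \end{cases}$$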


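Then, by (2.4), we have

$$\Delta_v \le 2^{\chi(v) n_v} \bigl( 6^s A\, H_K(\mathbf{y})^{-3} \bigr)^{\Gamma_v},$$

where $\sum_{v \in S} \Gamma_v = 5/6$. Taking the product over $v \in S$, we get

$$\prod_{v \in S} \Delta_v \le 2^{\delta} \bigl( 6^s A\, H_K(\mathbf{y})^{-3} \bigr)^{5/6}. \tag{2.5}$$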


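Now, for $v \notin S$, we have $|y_i/\alpha_i|_v = 1$ since $y_i/\alpha_i \in U_S$. A similar result holds for
$\mathbf{y}'$, so then

$$(y_i)_v = (y_i')_v = (\alpha_i)_v \quad (i = 1,2,3).$$

If we pick $i, j, k$ such that

$$(\alpha_i)_v \le (\alpha_j)_v \le (\alpha_k)_v,$$

then

$$\Delta_v \le \frac{(\alpha_i)_v (\alpha_j)_v}{(\boldsymbol{\alpha})_v^2} = \frac{(\alpha_i \alpha_j \alpha_k)_v}{(\boldsymbol{\alpha})_v^3},$$

where the inequality holds since $v \notin S$ is a non-Archimedean absolute value. Taking
the product over $v \notin S$ gives

$$\prod_{v \notin S} \Delta_v \le \prod_{v \notin S} \frac{(\alpha_1 \alpha_2 \alpha_3)_v}{(\boldsymbol{\alpha})_v^3} = A^{-1}.$$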

In conjunction with (2.5) this yields

$$\frac{1}{H_K(\mathbf{y}) H_K(\mathbf{y}')} = \prod_{v \in M(K)} \Delta_v \le 12^s A^{-1/6} H_K(\mathbf{y})^{-5/2}.$$

Lemma 2C follows.

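We return to the proof of Theorem 2A, where we were left with solutions having
$H_K(\mathbf{y}) \le A^{c_7(\delta)}$. Furthermore, we may initially restrict ourselves to solutions in a given
class, as explained above. For two such solutions $\mathbf{y}$, $\mathbf{y}'$, the Gap Principle of Lemma 2C,
along with $A \ge 12^{12s}$, gives the inequality $H_K(\mathbf{y}') \ge A^{1/12} H_K(\mathbf{y})^{3/2}$. Suppose some
fixed class contains solutions $\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_\nu$ with non-decreasing heights bounded above
by $A^{c_7(\delta)}$. Then by the Gap Principle,

$$H_K(\mathbf{y}_\nu) \ge A^{1/12} H_K(\mathbf{y}_{\nu-1})^{3/2} \ge \bigl( A^{1/12} \bigr)^{(3/2)^{\nu-1}}.$$

But $H_K(\mathbf{y}_\nu) \le A^{c_7(\delta)}$, so we have

$$\frac{1}{12} \Bigl( \frac{3}{2} \Bigr)^{\nu-1} \le c_7(\delta),$$

and $\nu < c_{11}(\delta)$.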


Recall that the number of classes was not greater than $c_{10}^s$, so that we have no
more than $c_{11}(\delta) c_{10}^s$ solutions with height bounded by $A^{c_7(\delta)}$. Combining this with the
other estimates, we get no more than $c_1(\delta) c_2^s$ solutions to the S-unit equation, where
$\delta = [K : \mathbb{Q}]$ and $s = \operatorname{card} S$. Our proof essentially followed the method of Evertse,
Győry, Stewart and Tijdeman (1988).

Evertse's proof of Theorem 2A used hypergeometric functions. These are advantageous
when studying rational approximations to radicals, i.e. numbers of the type $\sqrt[n]{r}$.


§3. More on S-unit Equations. 


In sections 1 and 2, we considered S-unit equations in two variables. We reformulate
this in projective space by considering the equation

$$\alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 = 0,$$

where $\alpha_1, \alpha_2, \alpha_3 \in K^\times$ are given and the solutions $(x_1, x_2, x_3)$ lie in $\mathbb{P}^2(U_S)$. Even more
generally, we may consider

$$\alpha_0 x_0 + \alpha_1 x_1 + \cdots + \alpha_n x_n = 0,$$

where $\alpha_i \in K^\times$ $(i = 0,\ldots,n)$ and where we seek solutions $(x_0,\ldots,x_n) \in \mathbb{P}^n(U_S)$.

If $n \ge 3$, we no longer necessarily have only finitely many solutions. Consider,
for instance, the following example. Let $n = 3$ and consider solutions of the equation
$x_1 + x_2 + x_3 + x_4 = 0$, where $S = \{\infty, 2\}$ and the number field $K = \mathbb{Q}$. There are
infinitely many solutions of the form $(2^n, -2^n, 2^m, -2^m)$ where $m, n \in \mathbb{Z}$.
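
The example can be checked mechanically. The following Python sketch verifies, for a range of exponents, that each quadruple consists of $\{\infty, 2\}$-units of $\mathbb{Q}$, sums to zero, and has a vanishing subsum. The helper names `is_s_unit` and `example_solution` are ours and are not taken from the text.

```python
from fractions import Fraction

def is_s_unit(q: Fraction) -> bool:
    """True if q is a {infinity, 2}-unit of Q, i.e. q = +-2^k for some integer k."""
    if q == 0:
        return False
    num, den = abs(q.numerator), abs(q.denominator)
    # A positive integer is a power of two iff it has a single set bit.
    return num & (num - 1) == 0 and den & (den - 1) == 0

def example_solution(m: int, n: int):
    """The quadruple (2^n, -2^n, 2^m, -2^m) from the example."""
    a, b = Fraction(2) ** n, Fraction(2) ** m
    return (a, -a, b, -b)

for m in range(-3, 4):
    for n in range(-3, 4):
        sol = example_solution(m, n)
        assert all(is_s_unit(t) for t in sol)   # every coordinate is an S-unit
        assert sum(sol) == 0                    # it solves x1 + x2 + x3 + x4 = 0
        assert sol[0] + sol[1] == 0             # but the subsum x1 + x2 vanishes
```

After dividing by the first coordinate, the solution becomes $(1, -1, 2^{m-n}, -2^{m-n})$, so distinct values of $m - n$ give distinct points of projective space.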

However, with suitable restrictions on the solutions, an S-unit equation will have
only finitely many solutions. We see this in the following theorem due to Evertse (1984b)
and Schlickewei and van der Poorten (1982).

THEOREM 3A. An S-unit equation has only finitely many solutions in $\mathbb{P}^n(U_S)$
for which no subsum vanishes, i.e. for which $\alpha_{i_1} x_{i_1} + \cdots + \alpha_{i_t} x_{i_t} \ne 0$ for any $\{i_1, i_2, \ldots, i_t\}
\subset \{0,1,\ldots,n\}$ with $t \ne 0$, $t \ne n+1$.

The proof, which will not be given here (but see Ch. V, §2), uses results on
simultaneous diophantine approximations which involve both Archimedean and
non-Archimedean absolute values. Mahler (1933) was the first to study diophantine
approximations using non-Archimedean absolute values.

For the remainder of this section we return to the case $n = 2$, considering equations
of the type

$$\alpha_0 x_0 + \alpha_1 x_1 + \alpha_2 x_2 = 0. \tag{3.1}$$

Evertse, Győry, Stewart, and Tijdeman defined an equivalence relation on S-unit
equations. A slight variation on their definition is that two equations (3.1) and

$$\alpha_0' x_0 + \alpha_1' x_1 + \alpha_2' x_2 = 0 \tag{3.2}$$

are equivalent if

$$\alpha_i' = \lambda \varepsilon_i \alpha_i \quad (i = 0,1,2),$$

where $\varepsilon_i \in U_S$ and $\lambda \in K^\times$. Equivalent equations have the same number of S-unit
solutions, for if $x_0, x_1, x_2$ is a solution of (3.1), then $x_0/\varepsilon_0, x_1/\varepsilon_1, x_2/\varepsilon_2$ is a solution of
(3.2).
THEOREM 3B. (Evertse, Győry, Stewart, Tijdeman (1988)). Except for finitely
many equivalence classes of equations, an S-unit equation over $\mathbb{P}^2(U_S)$ has at most two
solutions.


Remark. In general, there may be more solutions. Consider the S-unit equation
$x + y = 1$. Nagell (1969) proved that when $\delta \ge 5$, there are number fields $K$ of degree
$\delta$ such that this equation has at least $3(2\delta - 3)$ solutions in units of $K$, i.e. with
$S = M_\infty(K)$. In a more recent result, Erdős, Stewart and Tijdeman (1988) proved
that for $K = \mathbb{Q}$ and $s$ arbitrary, there are sets $S$ of cardinality $s$ such that the S-unit
equation above has at least $e^{c (s/\log s)^{1/2}}$ solutions. Stewart made a case that $e^{s^{2/3}}$ is the
correct order.


Remark. The number 2 in Theorem 3B is best possible, provided $s > 1$, so that
$U_S$ is infinite. To see this, pick $\xi, \eta \in U_S$, with $\xi \ne \eta$, $\xi \ne 1$, $\eta \ne 1$. Consider the
equation

$$\alpha_1 x + \alpha_2 y = 1,$$

where $\alpha_1 = (\eta - 1)/(\eta - \xi)$ and $\alpha_2 = (\xi - 1)/(\xi - \eta)$. This has solutions $(1, 1)$ and
$(\xi, \eta)$. Since there are infinitely many choices for $\xi, \eta$, there are infinitely many S-unit
equations of this particular form. However, only a finite number of these belong to a
given class. To see this, suppose

$$\alpha_1 x + \alpha_2 y = 1$$

and

$$\alpha_1' x + \alpha_2' y = 1$$

lie in the same class. Then $\alpha_1' = \alpha_1 \varepsilon_1$ and $\alpha_2' = \alpha_2 \varepsilon_2$ where $\varepsilon_1, \varepsilon_2 \in U_S$. Then the
second equation becomes $\alpha_1 \varepsilon_1 x + \alpha_2 \varepsilon_2 y = 1$. Since $(1,1)$ is a solution of this new
equation, we have

$$\alpha_1 \varepsilon_1 + \alpha_2 \varepsilon_2 = 1.$$

But this is an S-unit equation in the unknowns $\varepsilon_1, \varepsilon_2$, thus it has only finitely many
solutions. Therefore, we have infinitely many equivalence classes of equations with at
least two solutions.
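
The construction in this remark is easy to test with exact rational arithmetic. The sketch below builds the coefficients $\alpha_1, \alpha_2$ from a pair $\xi, \eta$ and checks that $(1,1)$ and $(\xi,\eta)$ both solve the equation. The function name `two_solution_equation` and the sample values $\xi = 2$, $\eta = 3$ (S-units of $\mathbb{Q}$ for the illustrative choice $S = \{\infty, 2, 3\}$) are assumptions made for this example only.

```python
from fractions import Fraction

def two_solution_equation(xi: Fraction, eta: Fraction):
    """Coefficients (alpha1, alpha2) of alpha1*x + alpha2*y = 1 from the remark."""
    if xi == eta or xi == 1 or eta == 1:
        raise ValueError("need xi != eta, xi != 1, eta != 1")
    alpha1 = (eta - 1) / (eta - xi)
    alpha2 = (xi - 1) / (xi - eta)
    return alpha1, alpha2

# Illustrative sample: xi = 2 and eta = 3 are S-units of Q for S = {infinity, 2, 3}.
xi, eta = Fraction(2), Fraction(3)
alpha1, alpha2 = two_solution_equation(xi, eta)

assert alpha1 * 1 + alpha2 * 1 == 1        # (1, 1) is a solution
assert alpha1 * xi + alpha2 * eta == 1     # (xi, eta) is a solution
```

For this sample the equation is $2x - y = 1$.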


Proof of Theorem 3B. We are considering S-unit equations of the form

$$\alpha x + \beta y + \gamma z = 0,$$

with solutions $(x,y,z) \in \mathbb{P}^2(U_S)$. Every equivalence class of equations contains an
equation of the type

$$\alpha x + \beta y + z = 0.$$

Now suppose that we have three distinct solutions, say $(x_i, y_i, z_i)$ $(i = 1,2,3)$.
Thus

$$\alpha x_i + \beta y_i + z_i = 0 \quad (i = 1,2,3). \tag{3.3}$$

Because the solutions are distinct in $\mathbb{P}^2(K)$, any two of the triples $(x_i, y_i, z_i)$ are
non-proportional. So we have

$$\operatorname{rank} \begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \end{pmatrix} = 2,$$

and then

$$\begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix} \ne 0, \tag{3.4}$$

by (3.3). Therefore, $x_1 y_2 z_3 \ne x_2 y_1 z_3$. Cyclic permutations give two more relations,
namely $x_2 y_3 z_1 \ne x_3 y_2 z_1$ and $x_3 y_1 z_2 \ne x_1 y_3 z_2$. All together, considering $y,z$ or $z,x$
in place of $x,y$ in (3.4), we get nine relations. Also, the three solutions to the given
S-unit equation satisfy (3.3), which may be interpreted as a system of three equations
for $\alpha, \beta, \gamma$. The matrix of this system must be singular:

$$\begin{vmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{vmatrix} = 0.$$

Writing out this determinant, we have

$$x_1 y_2 z_3 + x_2 y_3 z_1 + x_3 y_1 z_2 - x_1 y_3 z_2 - x_2 y_1 z_3 - x_3 y_2 z_1 = 0, \tag{3.5}$$

and the nine relations above impose the additional condition that no term with a $+$
sign can equal a term with a $-$ sign.
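
As a sanity check on the expansion (3.5), the following sketch compares the six-term expression with a cofactor expansion of the $3 \times 3$ determinant, and confirms that both vanish for three solutions of one linear equation. The function names `det3` and `expansion_35`, and the sample equation $x + y + z = 0$, are chosen here for illustration.

```python
def det3(rows):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def expansion_35(sols):
    """Left-hand side of (3.5) for sols = [(x1,y1,z1), (x2,y2,z2), (x3,y3,z3)]."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = sols
    return (x1 * y2 * z3 + x2 * y3 * z1 + x3 * y1 * z2
            - x1 * y3 * z2 - x2 * y1 * z3 - x3 * y2 * z1)

# Three pairwise non-proportional solutions of x + y + z = 0.
solutions = [(1, 1, -2), (1, 2, -3), (2, 1, -3)]
assert det3(solutions) == expansion_35(solutions) == 0
```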
To finish the proof, we need the following lemma. 


LEMMA 3C. There are only finitely many possibilities for the ratios $x_1 z_2 / x_2 z_1$
and $y_1 z_2 / y_2 z_1$.

Proof. The relation (3.5) is about a sum of S-units, i.e. an equation

$$a_1 + a_2 + a_3 - b_1 - b_2 - b_3 = 0,$$

with $a_i, b_i \in U_S$ $(i = 1,2,3)$, where $a_1, a_2, a_3$ are the terms of (3.5) with a $+$ sign and
$b_1, b_2, b_3$ those with a $-$ sign, in the order written. By Theorem 3A, this equation has only a finite number
of solutions in $\mathbb{P}^5(U_S)$ for which no subsum vanishes. We consider several cases.

Case (i). Suppose no subsum vanishes. Then we have only finitely many possibilities
for

$$b_1/a_2 = x_1 y_3 z_2 / x_2 y_3 z_1 = x_1 z_2 / x_2 z_1$$

and

$$a_3/b_3 = x_3 y_1 z_2 / x_3 y_2 z_1 = y_1 z_2 / y_2 z_1.$$

Case (ii). Suppose $a_1 + a_2 + a_3 = 0$ and $b_1 + b_2 + b_3 = 0$. For each of these
subsums, we may apply Theorem 3A to see that there are only finitely many possibilities
for $a_1/a_2$, $a_3/a_2$, $b_1/b_2$, $b_1/b_3$. Then there are only finitely many possibilities for
the product $a_1 a_3 b_1^2 / a_2^2 b_2 b_3$, which simplifies to $(x_1 z_2 / x_2 z_1)^3$. So we have only finitely
many possibilities for $x_1 z_2 / x_2 z_1$, as desired. By symmetry, we get the same result for
$y_1 z_2 / y_2 z_1$.

Again by symmetry, we are left with the following two cases.

Case (iii). $a_1 + a_2 = 0$ and $a_3 - b_1 - b_2 - b_3 = 0$.

Case (iv). $a_1 + a_2 - b_1 = 0$ and $a_3 - b_2 - b_3 = 0$.

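Exercise 3a. Establish the result for cases (iii) and (iv).

We return to the proof of Theorem 3B. From (3.3) with $i = 1,2$ we obtain

$$\alpha = - \frac{\begin{vmatrix} y_1 & z_1 \\ y_2 & z_2 \end{vmatrix}}{\begin{vmatrix} y_1 & x_1 \\ y_2 & x_2 \end{vmatrix}}, \qquad \beta = - \frac{\begin{vmatrix} x_1 & z_1 \\ x_2 & z_2 \end{vmatrix}}{\begin{vmatrix} x_1 & y_1 \\ x_2 & y_2 \end{vmatrix}},$$

so that

$$\alpha \frac{x_1}{z_1} = - \frac{(y_1 z_2 / y_2 z_1) - 1}{(x_2 y_1 / x_1 y_2) - 1}, \qquad \beta \frac{y_1}{z_1} = - \frac{(x_1 z_2 / x_2 z_1) - 1}{(x_1 y_2 / x_2 y_1) - 1}.$$

Thus, by Lemma 3C, there are only finitely many choices for $\alpha x_1 / z_1$ and $\beta y_1 / z_1$. Since
$x_1/z_1,\ y_1/z_1 \in U_S$, there are only finitely many possibilities for $\alpha, \beta$ up to equivalence
classes (i.e. up to S-units). Therefore, there are, up to equivalence, only finitely many
S-unit equations with more than two solutions.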


Remark. This argument cannot be generalized to give a similar result for S-unit
equations in four variables.


Remark. The number of exceptional equivalence classes has not been estimated,
although such an estimate could perhaps be derived from recent bounds on the number
of solutions of S-unit equations in $n$ variables.


§4. Elliptic, Hyperelliptic, and Superelliptic Equations.

We consider equations of the form $y^d = f(x)$, where $d \ge 2$, $\deg f = n \ge 2$ and
$\Delta(f) = \operatorname{discr}(f) \ne 0$. The case where $d = n = 2$ will be excluded. The polynomial
$f$ may have its coefficients in various fields or rings, for example, $f(X) \in \mathbb{Q}[X]$, or
$f(X) \in K[X]$ where $K$ is a number field, or $f(X) \in \mathfrak{O}_S[X]$ where $\mathfrak{O}_S$ is as before.

In the case $d = 2$, $n = 3$, we have elliptic equations, which have the form $y^2 = f(x)$,
where $f$ is a cubic polynomial with distinct roots. When $d = 2$ and $\deg f = n \ge 3$,
we have hyperelliptic equations. The most general case, namely $y^d = f(x)$, where
$d \ge 2$, $n \ge 2$, but $n$ and $d$ not both 2, is called a superelliptic equation.

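In the case of hyperelliptic equations, the genus of the Riemann surface is $g = [(n-1)/2]$.
For suppose the equation is written in factored form as $y^2 = a(x - \alpha_1)\cdots(x - \alpha_n)$.
Consider the case where $n$ is even, say $n = 2m$. Since $n$ is even, we may pair the roots,
as shown, making a cut between each pair.

[Figure: the roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ in the plane, paired off, with a cut between the two roots of each pair; the edges of the first cut are labelled $e_1$, $e_2$.]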


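We see that $y$ is an analytic function in this cut plane. In other words, we have
two Riemann spheres with $m$ cuts in each.

[Figure: the cut between $\alpha_1$ and $\alpha_2$ on the two spheres, with edges $e_1$, $e_2$ on one sphere and $d_1$, $d_2$ on the other.]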


Identifying edges of the various cuts, we get a single Riemann surface which is
homeomorphic to the surface of a pretzel with $m - 1$ "holes". We have to identify, e.g.,
the upper edge $e_1$ with the lower edge $d_2$, and the lower edge $e_2$ with the upper edge
$d_1$. Thus the genus is $m - 1 = [(2m - 1)/2] = [(n - 1)/2]$. The proof is the same for $n$
odd, except that the last root is paired with the point at infinity on the sphere.
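
The genus formula is a one-line computation. The following sketch (the function name `hyperelliptic_genus` is ours) evaluates $g = [(n-1)/2]$ and checks it against the count of $m - 1$ holes for even $n = 2m$.

```python
def hyperelliptic_genus(n: int) -> int:
    """Genus [(n-1)/2] of y^2 = f(x), where deg f = n >= 3 and f has distinct roots."""
    if n < 3:
        raise ValueError("hyperelliptic equations require n >= 3")
    return (n - 1) // 2

# For even n = 2m the cut-and-paste argument gives m - 1 holes.
for m in range(2, 10):
    assert hyperelliptic_genus(2 * m) == m - 1
```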


Exercise 4a. Consider the equation $y^2 = f(x)$, where $f$ is a cubic polynomial
with distinct roots. Show the genus $g = 1$.

THEOREM 4A. A superelliptic equation with coefficients in an algebraic number
field $K$ has only finitely many solutions $x, y \in \mathfrak{O}_S$. (Here $S \subset M(K)$ and $\mathfrak{O}_S$ are as
before.)

The special case of an elliptic equation was done by Mordell (1922). The general
case, which is due to Siegel (1926), was published under the pseudonym X.

In the proof it is necessary to consider an extension field $K' \supset K$. We then choose
$S' \subset M(K')$ large, in particular so large that it contains every extension of elements of
$S$ to the field $K'$. Then if $x \in K$ is an $S$-unit, it is also an $S'$-unit. For if $|x|_v = 1$ for
every $v \notin S$, then $|x|_w = 1$ for every $w \notin S'$.


Proof. In some extension field $K'$, the polynomial $f$ factors, i.e.

$$y^d = a(x - \alpha_1) \cdots (x - \alpha_n), \tag{4.1}$$

with $a, \alpha_i \in K'$, provided $K'$ is sufficiently large. Furthermore, if $S'$ is large enough,
then $a, \alpha_i \in \mathfrak{O}_{S'}$. By a change of notation, we may suppose that $K, S$ have these
properties. We will use the following lemma to rewrite the factors $x - \alpha_i$.


LEMMA 4B. There exists a finite set $B$ of non-zero elements of $K$ such that for
any solution $x, y$ to (4.1) with $y \ne 0$, we have

$$x - \alpha_i = \beta_i y_i^d,$$

where $\beta_i \in B$ and $y_i \in \mathfrak{O}_S$.


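Proof.
(i) There is a finite subset $\varepsilon_1, \ldots, \varepsilon_m$ in $U_S$ such that every $u \in U_S$ may be written in
the form

$$u = \varepsilon_i u'^{\,d},$$

where $1 \le i \le m$ and $u' \in U_S$.

(ii) Let $\mathcal{P}$ be the set of prime ideals in $\mathfrak{O}_S$ which either divide $\alpha_i - \alpha_j$ for some $i \ne j$ or
divide $a$. Then $\mathcal{P}$ is finite. Suppose some prime ideal $\mathfrak{P}$ divides $x - \alpha_i$ and $x - \alpha_j$
for $i \ne j$. Then $\mathfrak{P} \in \mathcal{P}$.

(iii) Say $\mathcal{P}$ consists of $\mathfrak{P}_1, \ldots, \mathfrak{P}_\ell$. If $\mathfrak{B}$ is an ideal whose prime factors lie in $\mathcal{P}$, then
$\mathfrak{B} = \mathfrak{P}_1^{c_1} \cdots \mathfrak{P}_\ell^{c_\ell} (\mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_\ell^{e_\ell})^d$, where $0 \le c_i < d$ for $i = 1, \ldots, \ell$. In other words,
$\mathfrak{B} = \mathfrak{B}^* \mathfrak{Q}^d$ with only finitely many possibilities for $\mathfrak{B}^*$.

(iv) We know that the class number $h$ of the ring $\mathfrak{O}_S$ is finite. This means that there
are certain fractional ideals $\mathfrak{D}_1, \ldots, \mathfrak{D}_h$ such that every ideal has the form

$$\mathfrak{D}_i (z),$$

where $(z)$ denotes the principal fractional ideal generated by $z \in K$ and where
$1 \le i \le h$. If the $\mathfrak{D}_i$ are properly chosen, then any integral ideal $\mathfrak{D}$ will have the
form

$$\mathfrak{D} = \mathfrak{D}_i (z),$$

where now $z \in \mathfrak{O}_S$.

We now combine these observations. For $x - \alpha_i$ given, write $(x - \alpha_i) = \mathfrak{A}_i \mathfrak{B}_i$, where
$\mathfrak{A}_i$ consists of prime factors which are not in $\mathcal{P}$ and $\mathfrak{B}_i$ consists of prime factors which
are in $\mathcal{P}$. Then we have

$$(y)^d = (a)(x - \alpha_1) \cdots (x - \alpha_n) = (a)\, \mathfrak{A}_1 \cdots \mathfrak{A}_n \mathfrak{B}_1 \cdots \mathfrak{B}_n,$$

where each $\mathfrak{A}_i$ is coprime to the other factors on the right-hand side by remark (ii). But
the left-hand side is a $d$th power, so each $\mathfrak{A}_i$ must be also. Using this fact and (iii) from
above, write

$$\mathfrak{A}_i = \mathfrak{C}_i^d \quad \text{and} \quad \mathfrak{B}_i = \mathfrak{B}_i^* \mathfrak{Q}_i^d.$$

Then

$$(x - \alpha_i) = \mathfrak{B}_i^* (\mathfrak{Q}_i \mathfrak{C}_i)^d,$$

and we noted that there are only finitely many possibilities for the $\mathfrak{B}_i^*$. By (iv), we may
write

$$\mathfrak{Q}_i \mathfrak{C}_i = \mathfrak{D}_{j_i} (z_i)$$

with $z_i \in \mathfrak{O}_S$. Taking $d$th powers and noting that $(x - \alpha_i)$ is a principal ideal in $\mathfrak{O}_S$,
we have

$$(x - \alpha_i) = (\xi_i)(z_i^d),$$

with only finitely many possibilities for the principal ideal $(\xi_i)$. Then

$$x - \alpha_i = u_i \xi_i z_i^d,$$

where $u_i \in U_S$, and $\xi_i$ lies in a finite set. By observation (i), we have

$$u_i = \varepsilon_{k_i} u_i'^{\,d},$$

with only finitely many possibilities for $\varepsilon_{k_i}$. So finally, we get

$$x - \alpha_i = (\varepsilon_{k_i} \xi_i)(u_i' z_i)^d,$$

which gives the result with $\beta_i = \varepsilon_{k_i} \xi_i$ and $y_i = u_i' z_i$.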
We return to the proof of Theorem 4A, considering two cases.

In the first case, we suppose $d \ge 3$, $n \ge 2$. Using the lemma, write

$$x - \alpha_1 = \beta_1 y_1^d,$$
$$x - \alpha_2 = \beta_2 y_2^d.$$

Then

$$\beta_1 y_1^d - \beta_2 y_2^d = \alpha_2 - \alpha_1 \ne 0,$$

which is a Thue equation in the variables $y_1, y_2$ since $d \ge 3$. So we have only finitely
many solutions $y_1, y_2$. Since $x$ is determined by the $\beta_i$'s and $y_i$'s, we have only finitely
many possibilities for $x$.

The remaining case is when $d = 2$ and $n \ge 3$. As above, write

$$x - \alpha_1 = \beta_1 y_1^2,$$
$$x - \alpha_2 = \beta_2 y_2^2,$$
$$x - \alpha_3 = \beta_3 y_3^2.$$

We need to solve this system of equations in $x, y_1, y_2, y_3 \in \mathfrak{O}_S$.

First, we extend $K$ so that it contains $\sqrt{\beta_1}, \sqrt{\beta_2}, \sqrt{\beta_3}$. Then the right-hand
sides will be squares, i.e., let $z_i = \sqrt{\beta_i}\, y_i$ so that $x - \alpha_i = z_i^2$ $(i = 1,2,3)$. Letting
$\gamma_3 = \alpha_2 - \alpha_1 \ne 0$, and permuting the indices to get $\gamma_1, \gamma_2$, we have

$$z_1^2 - z_2^2 = \gamma_3,$$
$$z_2^2 - z_3^2 = \gamma_1,$$
$$z_3^2 - z_1^2 = \gamma_2.$$

Now the left-hand sides can be factored. We have, for instance,

$$(z_1 - z_2)(z_1 + z_2) = \gamma_3.$$

We write

$$z_1 - z_2 = \rho_3 u_3, \tag{4.2}$$

where $u_3$ is a unit and (since $z_1 - z_2$ divides $\gamma_3$) where we may take $\rho_3$ from a finite set.
We also have

$$z_2 - z_3 = \rho_1 u_1,$$
$$z_3 - z_1 = \rho_2 u_2.$$

Adding these last three equations gives

$$\rho_1 u_1 + \rho_2 u_2 + \rho_3 u_3 = 0,$$

an S-unit equation. Hence there are only finitely many solutions $(u_1, u_2, u_3) \in \mathbb{P}^2(U_S)$.

We would like to know that there are only finitely many possibilities for the $z_i$ $(i =
1,2,3)$. Then it will follow that there are only finitely many solutions to the original
hyperelliptic equation in this case. So we consider

$$z_1 + z_2 = \frac{\gamma_3}{\rho_3 u_3}, \tag{4.3}$$

which in conjunction with (4.2) gives

$$z_1 = \frac{1}{2} \Bigl( \rho_3 u_3 + \frac{\gamma_3}{\rho_3 u_3} \Bigr).$$
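
The step from (4.2) and (4.3) to the formula for $z_1$ is just solving for $z_1, z_2$ from their sum and difference. A small sketch with exact rationals illustrates this; the function name `recover_z1_z2` and the sample values are ours, and $w$ stands for the product $\rho_3 u_3$.

```python
from fractions import Fraction

def recover_z1_z2(w: Fraction, gamma3: Fraction):
    """Given w = z1 - z2 (non-zero) and gamma3 = z1^2 - z2^2, return (z1, z2)."""
    s = gamma3 / w          # z1 + z2, as in (4.3)
    return (s + w) / 2, (s - w) / 2

# Illustrative values only.
z1, z2 = Fraction(7, 3), Fraction(-2, 5)
assert recover_z1_z2(z1 - z2, z1**2 - z2**2) == (z1, z2)
```

The second component, $z_2 = \frac{1}{2}(\gamma_3/w - w)$, is the expression for $z_2$ that follows from (4.2) and (4.3).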


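Similarly, by cyclic permutation,

$$z_2 = \frac{1}{2} \Bigl( \rho_1 u_1 + \frac{\gamma_1}{\rho_1 u_1} \Bigr)$$

and

$$z_3 = \frac{1}{2} \Bigl( \rho_2 u_2 + \frac{\gamma_2}{\rho_2 u_2} \Bigr).$$

Now the $\gamma_i$ are fixed and we have only finitely many choices for the $\rho_i$. There are
finitely many possibilities for $(u_1, u_2, u_3)$ up to equivalence in $\mathbb{P}^2(U_S)$. So suppose that
we replace $u_i$ by $\lambda u_i$ $(i = 1,2,3)$. Equating the two expressions for $z_2$ (the one just
obtained and $z_2 = z_1 - \rho_3 u_3$ from (4.2)) gives

$$(\rho_1 u_1 + \rho_3 u_3) \lambda = - \Bigl( \frac{\gamma_1}{\rho_1 u_1} - \frac{\gamma_3}{\rho_3 u_3} \Bigr) \frac{1}{\lambda},$$

so $\lambda$ is determined (up to $\pm$) unless $\rho_1 u_1 + \rho_3 u_3 = 0$, which is impossible.

In the next section, we will obtain estimates on the number of solutions.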


§5. The Number of Solutions of Elliptic, Hyperelliptic, and Superelliptic 
Equations. 


Here we discuss relatively explicit bounds on the number of solutions of the various 
equations. These results are the joint work of Evertse and Silverman (1986). 

Let $K$ be a number field of degree $\delta$ and $K^\times$ the multiplicative group of $K$. Let
$S$ be a finite set of absolute values which contains all of the Archimedean ones, i.e.
$M_\infty(K) \subset S \subset M(K)$, and let $s = \operatorname{card} S$. As above, let $\mathfrak{O}_S$ denote the S-integers in $K$
and $U_S$ the S-units. Consider polynomials $f(X) \in \mathfrak{O}_S[X]$ with discriminant $\Delta(f) \in U_S$.
Notice that this last requirement is not much of a restriction, since we may enlarge $S$
to force $\Delta(f) \in U_S$. Then the cardinality $s$ will reflect the number of prime factors of
$\Delta(f)$.

In what follows, $L$ is an extension of $K$ with degree $[L : K] = \ell$. We will also have
$d \ge 2$, and $h_d(L)$ will denote the order of the subgroup of the ideal class group of $L$
consisting of elements $[\mathfrak{A}]$ with $[\mathfrak{A}]^d = 1$. We will count solutions of the superelliptic
equation

$$y^d = f(x), \tag{5.1}$$

with $x \in \mathfrak{O}_S$, $y \ne 0$, $y \in K$. (Then automatically, $y \in \mathfrak{O}_S$.)
THEOREM 5A.

(a) Suppose $d \ge 3$, $n \ge 2$, and $L$ contains at least two roots of $f$. Then the number of
solutions of (5.1) with $x \in \mathfrak{O}_S$ and $y \in K^\times$ is

$$\le 17^{\ell(6\delta + s)} d^{2 \ell s} h_d(L).$$

(b) Suppose $d = 2$, $n \ge 3$ and $L$ contains at least three roots of $f$. Then the number
of solutions is

$$\le 7^{\ell(4\delta + 9s)} h_2(L)^2.$$

Remark. We may pick $L$ with $\ell \le n(n - 1)$ in case (a) and $\ell \le n(n - 1)(n - 2)$
in case (b). Aside from the choice of $L$, the coefficients of the polynomial $f$ do not
enter into the estimates. In the case of an elliptic equation, one may conclude that the
number of solutions is $\le c(\varepsilon) H^{2 + \varepsilon}$, where $H$ is the height of the equation. See Schmidt
(to appear).

Here we will prove a weaker form of the case (a). We will show that the number of
solutions in (a) is

$$\le (c_1 d)^{2 \ell s} h_d(L).$$

We need several lemmas first.


LEMMA 5B. Suppose $|\ |$ is a non-Archimedean absolute value on a field $E$. Let
the polynomial

$$f(X) = a_n X^n + \cdots + a_0 = a_n (X - \alpha_1) \cdots (X - \alpha_n)$$

be given with $a_i, \alpha_i$ in $E$ and $|a_i| \le 1$ $(i = 0, \ldots, n)$, and also $|\Delta(f)| = 1$ where $\Delta$
denotes the discriminant. Then for every $x \in E$ with $|x| \le 1$ and $i \ne j$, we have

$$|\alpha_i - \alpha_j| = \max(|x - \alpha_i|, |x - \alpha_j|) = \max(1, |\alpha_i|) \max(1, |\alpha_j|).$$
Proof. Because $|\ |$ is non-Archimedean, we have

$$|\alpha_i - \alpha_j| \le \max(|x - \alpha_i|, |x - \alpha_j|) \le \max(1, |\alpha_i|) \max(1, |\alpha_j|). \tag{5.2}$$

We also know that

$$|\Delta(f)| = |a_n|^{2n-2} \prod_{i \ne j} |\alpha_i - \alpha_j| = 1. \tag{5.3}$$

By Gauss' Lemma (and with the notation $|f| = \max_i |a_i|$)

$$|a_n| \prod_{i=1}^{n} \max(1, |\alpha_i|) = |f| \le 1,$$

so

$$|a_n|^{2n-2} \prod_{i \ne j} \max(1, |\alpha_i|) \max(1, |\alpha_j|) \le 1. \tag{5.4}$$

Comparing (5.3) and (5.4) gives

$$\prod_{i \ne j} \max(1, |\alpha_i|) \max(1, |\alpha_j|) \le \prod_{i \ne j} |\alpha_i - \alpha_j|,$$

and this is

$$\le \prod_{i \ne j} \max(1, |\alpha_i|) \max(1, |\alpha_j|)$$

by (5.2). We therefore have equality everywhere, in particular in (5.2).

Now let $v \in M(K)$ such that $v \mid p$ for a prime $p$. Then $v$ extends the $p$-adic absolute
value. In this case, the value group $G_v$ of $v$ consists of powers $\pi^\ell$ $(\ell \in \mathbb{Z})$, where $\pi$ is
some fixed fractional power of $p$, say $\pi = p^{1/e}$.
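
For $K = \mathbb{Q}$ and the ordinary $p$-adic absolute value, Lemma 5B can be tested numerically. The sketch below does this for the illustrative polynomial $f = (X-1)(X-2)(X-4)$, whose discriminant is $36$ and hence a unit at $p = 5$ but not at $p = 3$. The helper names `padic_abs` and `lemma_5b_holds`, and the choice of polynomial, are ours.

```python
from fractions import Fraction
from itertools import combinations

def padic_abs(q, p: int) -> Fraction:
    """p-adic absolute value |q|_p of a rational number q, with |0|_p = 0."""
    q = Fraction(q)
    if q == 0:
        return Fraction(0)
    k, num, den = 0, q.numerator, q.denominator
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return Fraction(1, p) ** k

def lemma_5b_holds(roots, x, p: int) -> bool:
    """Check |a_i - a_j| = max(|x - a_i|, |x - a_j|) = max(1,|a_i|) max(1,|a_j|)."""
    for ai, aj in combinations(roots, 2):
        lhs = padic_abs(ai - aj, p)
        mid = max(padic_abs(x - ai, p), padic_abs(x - aj, p))
        rhs = max(1, padic_abs(ai, p)) * max(1, padic_abs(aj, p))
        if not (lhs == mid == rhs):
            return False
    return True

roots = [1, 2, 4]   # f = (X-1)(X-2)(X-4), discriminant 36
assert all(lemma_5b_holds(roots, x, 5) for x in range(-50, 51))
# At p = 3 the discriminant is not a unit and the conclusion can fail.
assert not lemma_5b_holds(roots, 1, 3)
```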


LEMMA 5C. Let $v, \pi$ be as above. Suppose $|x|_v \le 1$, $|f|_v \le 1$, $|\Delta(f)|_v = 1$, and
$f(x) = y^d$, where $y \in K^\times$. Then

$$|x - \alpha_i|_v = \max(1, |\alpha_i|_v)\, \pi^{d u_i},$$

where $u_i \in \mathbb{Z}$ $(i = 1, \ldots, n)$. In fact, $u_i = 0$ with the possible exception of one value
$u_{i_0}$.


Proof. By the proof of the preceding lemma, we have

$$|a_n|_v \prod_{i=1}^{n} \max(1, |\alpha_i|_v) = |f|_v = 1.$$

Since $f(x) \in (K^\times)^d$,

$$|f(x)|_v = |a_n|_v \prod_{i=1}^{n} |x - \alpha_i|_v \in G_v^d.$$

Then

$$\prod_{i=1}^{n} \frac{|x - \alpha_i|_v}{\max(1, |\alpha_i|_v)} \in G_v^d.$$

Letting $c_i = |x - \alpha_i|_v / \max(1, |\alpha_i|_v)$, we have

$$\prod_{i=1}^{n} c_i \in G_v^d.$$

Now if $|\alpha_i|_v > 1$, then $c_i = 1$. If, on the other hand, $|\alpha_i|_v \le 1$ and $|\alpha_j|_v \le 1$, then
$|(x - \alpha_j) - (x - \alpha_i)|_v = |\alpha_i - \alpha_j|_v = 1$ by Lemma 5B. So only one of $|x - \alpha_i|_v$, $|x - \alpha_j|_v$
can be strictly less than 1, thus only one of $c_i, c_j$ can be strictly less than 1. Therefore,
$c_i = 1$ with one possible exception, and each $c_i \in G_v^d$. That is,

$$\frac{|x - \alpha_i|_v}{\max(1, |\alpha_i|_v)} \in G_v^d \quad (i = 1, \ldots, n),$$

as desired.
As in Chapter II, Section 13, suppose there are $t$ non-Archimedean elements of $S$.
These absolute values correspond to prime ideals $\mathfrak{P}_1, \ldots, \mathfrak{P}_t$. Given fractional ideals
$\mathfrak{A}, \mathfrak{B}$, we write

$$\mathfrak{A} \equiv \mathfrak{B} \pmod{S}$$

if $\mathfrak{A}/\mathfrak{B}$ is of the type $\mathfrak{P}_1^{c_1} \cdots \mathfrak{P}_t^{c_t}$ with integers $c_1, \ldots, c_t$. We write $\mathfrak{A} \equiv \mathfrak{B} \pmod{S, d}$ if
$\mathfrak{A}/\mathfrak{B}$ is of the type $\mathfrak{P}_1^{c_1} \cdots \mathfrak{P}_t^{c_t} \mathfrak{C}^d$ where $\mathfrak{C}$ is any fractional ideal. Consider the congruence
in the variable $z$ given by

$$(z) \equiv \mathfrak{A} \pmod{S, d}, \tag{5.4}$$

where $(z)$ is the principal ideal with generator $z$. If $z$ is a solution and $z' = z w^d$, then
$z'$ is also a solution. So it is valid to count solutions $z \in K^\times / (K^\times)^d$.


LEMMA 5D. The number of solutions of the congruence (5.4) in $K^\times / (K^\times)^d$ is

$$\le d^t h_d(K).$$


Proof. Suppose that there exists a solution $z_0$. Then for any other solution $z$, we
have

$$(z/z_0) \equiv (1) \pmod{S, d}.$$

Thus, it suffices to count solutions $z$ of

$$(z) \equiv (1) \pmod{S, d}.$$

Suppose $z$ is such a solution. By the definition of the congruence relation, we have

$$(z) = \mathfrak{P}_1^{c_1} \cdots \mathfrak{P}_t^{c_t} \mathfrak{C}^d,$$

and without loss of generality, $0 \le c_i < d$ $(i = 1, \ldots, t)$.
We will count solutions $z$ with fixed $c_1, \ldots, c_t$. Say $z_1$ is a fixed such solution,

$$(z_1) = \mathfrak{P}_1^{c_1} \cdots \mathfrak{P}_t^{c_t} \mathfrak{C}_1^d,$$

and $z$ is an arbitrary such solution,

$$(z) = \mathfrak{P}_1^{c_1} \cdots \mathfrak{P}_t^{c_t} \mathfrak{C}^d.$$

Then

$$(z/z_1) = (\mathfrak{C}/\mathfrak{C}_1)^d.$$

The ideal class of $\mathfrak{C}/\mathfrak{C}_1$, say $[\mathfrak{C}/\mathfrak{C}_1]$, has $[\mathfrak{C}/\mathfrak{C}_1]^d = [1]$. Also, if $\mathfrak{C}, \mathfrak{C}_1$ are in the same ideal
class, then $(z/z_1) \in (K^\times)^d$. So (since we only want solutions modulo $(K^\times)^d$) all that
remains is to count ideal classes whose $d$th power is $[1]$. But their number is $h_d(K)$ by
definition.

Allowing for all the possibilities for $c_1, \ldots, c_t$ with $0 \le c_i < d$ $(i = 1, \ldots, t)$, we
have

$$\le d^t h_d(K)$$

solutions in $K^\times / (K^\times)^d$.

LEMMA 5E. Suppose that $d \ge 3$ and $\mathfrak{A}$ is a fractional ideal. Consider solutions
$\alpha \in K^\times$ to the pair of congruences

$$(\alpha) \equiv \mathfrak{A} \pmod{S, d},$$
$$(1 - \alpha) \equiv (1, \alpha) \pmod{S}.$$

The number of such $\alpha$ is

$$\le (c_1 d)^{2s} h_d(K),$$

where $c_1$ is an absolute constant.


Proof. Write $\alpha = w z^d$, where $w$ runs through a complete residue system in
$K^\times / (K^\times)^d$. Then by hypothesis,

$$(1 - w z^d) \equiv (1, w z^d) \pmod{S},$$

which may be written as

$$\frac{(1 - w z^d)}{(1, w z^d)} \equiv (1) \pmod{S}.$$

For a given $w$, we count the number of solutions $z$. By Theorem 13E of Chapter III,
the number of solutions is $\le (c_1 d)^s$ where $c_1$ is absolute. (Sorry about the occurrence
of $c_1$ with two meanings.)

But we also have

$$(w) \equiv \mathfrak{A} \pmod{S, d},$$

where $w \in K^\times / (K^\times)^d$, and by the preceding lemma, the number of such $w$ is $\le d^t h_d(K)$.
Therefore, the total number of solutions is

$$\le (c_1 d)^{2s} h_d(K).$$


We are now ready to return to the proof of Theorem 5A, part (a). We are considering
solutions of the superelliptic equation

$$y^d = f(x) = a(x - \alpha_1) \cdots (x - \alpha_n),$$

where $x \in \mathfrak{O}_S$, $y \in K^\times$.
In (a) we had $d \ge 3$, $n \ge 2$, and $L$ contained at least two roots of $f$, say $\alpha_1, \alpha_2 \in L$.
Let $S'$ be the set of absolute values of $L$ which extend absolute values of $S$.

For $x \in \mathfrak{O}_S$, put

$$Z(x) = \frac{x - \alpha_1}{x - \alpha_2}.$$

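For $v \notin S'$, we have $|\Delta(f)|_v = 1$, so by Lemma 5B, we have

$$|\alpha_1 - \alpha_2|_v = \max(|x - \alpha_1|_v, |x - \alpha_2|_v).$$

This means that for every $v \notin S'$

$$\Bigl| \frac{\alpha_1 - \alpha_2}{x - \alpha_2} \Bigr|_v = \max\Bigl( 1, \Bigl| \frac{x - \alpha_1}{x - \alpha_2} \Bigr|_v \Bigr), \quad \text{i.e.} \quad |1 - Z(x)|_v = \max(1, |Z(x)|_v),$$

or in terms of prime ideals,

$$(1 - Z(x)) \equiv (1, Z(x)) \pmod{S'}.$$

Also, by Lemma 5C, for $v \notin S'$, we have

$$\frac{|x - \alpha_i|_v}{\max(1, |\alpha_i|_v)} \in G_v^d \quad (i = 1,2),$$

so that

$$|Z(x)|_v = \frac{\max(1, |\alpha_1|_v)}{\max(1, |\alpha_2|_v)}\, g_v^d$$

with $g_v \in G_v$. Now we have

$$(Z(x)) \equiv \mathfrak{A} \pmod{S', d},$$

where $\mathfrak{A}$ is a certain ideal. By the last lemma, the number of possibilities for $Z(x)$ is
$\le (c_1 d)^{2 \ell s} h_d(L)$, since $\operatorname{card}(S') \le \ell \operatorname{card} S = \ell s$, where $\ell = [L : K]$. Since $Z(x)$
determines $x$, the same bound holds for the number of $x$.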


§6. On Elliptic Curves. 
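Consider an irreducible polynomial equation

$$f(x, y) = 0$$

and its associated affine curve, embedded in two-dimensional space. A point $P$ on the
curve is called a singular point if

$$\frac{\partial f}{\partial x}(P) = \frac{\partial f}{\partial y}(P) = 0.$$

Otherwise, it is called a non-singular point. For such a point, the equation of the tangent
line at $P$ is given by

$$\Bigl( \frac{\partial f}{\partial x}(P) \Bigr) X + \Bigl( \frac{\partial f}{\partial y}(P) \Bigr) Y = C,$$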

where $C$ is a constant. Thus, at non-singular points, we have a well-defined tangent
line.

If the total degree of $f$ is $d$, then put

$$F(x, y, z) = z^d f\Bigl( \frac{x}{z}, \frac{y}{z} \Bigr),$$

so that $F$ is homogeneous of degree $d$. We study solutions of

$$F(x, y, z) = 0,$$

where $(x, y, z) \in \mathbb{P}^2$. If the point $(x_0, y_0)$ lies on the affine curve $f(x, y) = 0$, then
$(x_0, y_0, 1)$ lies on the projective curve $F(x, y, z) = 0$. Conversely, if $(x_0, y_0, z_0)$ lies on
the projective curve and $z_0 \ne 0$, then $(x_0/z_0, y_0/z_0)$ lies on the affine curve. In other
words, there is a one-to-one correspondence between points on the affine curve and
points on the projective curve with $z \ne 0$. Points on the projective curve with $z = 0$
are called the points at infinity of the affine curve. We say the curve is of degree $d$ if $F$
is of degree $d$.
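
The passage from $f$ to $F$ is purely mechanical on monomials: $x^i y^j$ becomes $x^i y^j z^{d-i-j}$. The sketch below implements this for polynomials stored as dictionaries from exponent tuples to coefficients. This representation and the names `homogenize` and `dehomogenize` are assumptions made for illustration.

```python
def homogenize(f: dict) -> dict:
    """Return F(x,y,z) = z^d f(x/z, y/z) for f given as {(i, j): coeff of x^i y^j}."""
    d = max(i + j for (i, j), c in f.items() if c != 0)
    return {(i, j, d - i - j): c for (i, j), c in f.items() if c != 0}

def dehomogenize(F: dict) -> dict:
    """Set z = 1 in F, given as {(i, j, k): coeff of x^i y^j z^k}."""
    f = {}
    for (i, j, _k), c in F.items():
        f[(i, j)] = f.get((i, j), 0) + c
    return f

# Illustrative curve: y^2 - x^3 + x = 0, i.e. y^2 = x^3 - x.
f = {(0, 2): 1, (3, 0): -1, (1, 0): 1}
F = homogenize(f)            # y^2 z - x^3 + x z^2
assert dehomogenize(F) == f

# Points at infinity: only the monomials without z survive when z = 0.
at_infinity = {mono: c for mono, c in F.items() if mono[2] == 0}
assert at_infinity == {(3, 0, 0): -1}   # -x^3 = 0, so x = 0 and the point is (0, 1, 0)
```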


Example. Consider $y^2 = f(x)$, where $f$ is a cubic with non-zero discriminant. We
may write $y^2 = a(x - \alpha_1)(x - \alpha_2)(x - \alpha_3)$, and we see that the corresponding projective
curve is $y^2 z = a(x - \alpha_1 z)(x - \alpha_2 z)(x - \alpha_3 z)$. The points at infinity occur when $z = 0$,
so $x = 0$ and $y \ne 0$, i.e. the point $(0,1,0) \in \mathbb{P}^2$.


Example. Consider the cubic Thue equation $f(x, y) = m$ and the corresponding
projective curve $f(x, y) = m z^3$. In factored form we have $a(x - \alpha_1 y)(x - \alpha_2 y)(x - \alpha_3 y) =
m z^3$. If $z = 0$, then $x = \alpha_i y$ for some $i \in \{1,2,3\}$, so the three points at infinity are
$(\alpha_1, 1, 0)$, $(\alpha_2, 1, 0)$, $(\alpha_3, 1, 0) \in \mathbb{P}^2$.

We will also need to talk about lines in projective space. Recall, a line in affine
space is given by the equation $ax + by + c = 0$, where $a$ and $b$ are not both zero. In
projective space, this becomes $ax + by + cz = 0$, where we require that $a, b, c$ are not all
zero. The additional line which appears, i.e. the line $z = 0$, is called the line at infinity.

Recall, in affine space, we said the singular points on $f$ correspond to

$$\frac{\partial f}{\partial x}(x, y) = \frac{\partial f}{\partial y}(x, y) = 0 \quad \text{and} \quad f(x, y) = 0.$$

In projective space, we have (recall that $F$ was homogeneous of degree $d$)

$$d\, F(x, y, z) = x \frac{\partial F}{\partial x}(x, y, z) + y \frac{\partial F}{\partial y}(x, y, z) + z \frac{\partial F}{\partial z}(x, y, z),$$

so that singular points may reasonably be defined by

$$\frac{\partial F}{\partial x}(x, y, z) = \frac{\partial F}{\partial y}(x, y, z) = \frac{\partial F}{\partial z}(x, y, z) = 0.$$

It is a simple lemma to prove that if an affine point is non-singular, then the
corresponding projective point is also non-singular, and conversely.

Given a projective curve, we could specialize any of the variables to 1 to obtain
corresponding affine curves. This is illustrated by the diagram here.

    projective curve              affine curves
                                  F(x,y,1) = 0
    F(x,y,z) = 0         -->      F(x,1,z) = 0
                                  F(1,y,z) = 0

We may use these affine curves to study properties of the projective curve.


Example. Consider the curve $y^2 = f(x)$ with $\operatorname{discr}(f) \ne 0$. Does it contain any
singular points? If so, there is a solution to $2y = 0$, $f(x) = 0$, $f'(x) = 0$, but this
cannot happen since $\operatorname{discr}(f) \ne 0$. What about the point at infinity? Is it a singular point?
The projective curve is given by

$$y^2 z = a(x - \alpha_1 z)(x - \alpha_2 z)(x - \alpha_3 z)$$

and the point at infinity is $(0,1,0) \in \mathbb{P}^2$, as seen earlier. We check the partial with
respect to $z$, and we see that a singular point would require

$$y^2 = \frac{\partial}{\partial z} \bigl( a(x - \alpha_1 z)(x - \alpha_2 z)(x - \alpha_3 z) \bigr),$$

which does not happen at $(0,1,0) \in \mathbb{P}^2$. So the curve is non-singular, i.e. it has no
singular points.


Example. The projective cubic Thue equation $f(x, y) = m z^3$ is also non-singular.


Bézout's Theorem. If $C_1$, $C_2$ are projective curves of degrees $d_1$, $d_2$, respectively,
then the total number of their intersection points (counted according to multiplicities
which are not defined here) is $d_1 d_2$.

In the special case of a cubic curve intersected with a line, the number of points
of intersection is 3. Actually, it is not necessary to use the general version of Bézout's
Theorem to get this result. We return to the examples once again, considering
intersections with particular lines.


Example. Consider the cubic equation $y^2 = f(x)$ and the line at infinity, $z = 0$.
Their intersection is the point $(0,1,0)$ in $\mathbb{P}^2$, so this point must have multiplicity 3.


Example. Consider the cubic $f(x, y) = m z^3$ where $f = a(x - \alpha_1 y)(x - \alpha_2 y)(x -
\alpha_3 y)$ and the line $x = \alpha_i y$ for $i \in \{1,2,3\}$. These intersect at $(\alpha_i, 1, 0)$, which are all
triple points of intersection.

Now suppose that $C$ is a non-singular curve. A divisor is a formal sum

$$D = \sum_{P \in C} c(P) P$$

where $c(P) \in \mathbb{Z}$ and $c(P) = 0$ for all but finitely many points $P$. The divisors of $C$ form
a group, denoted by $\operatorname{Div} C$. For $D$ a divisor, we let

$$\deg D = \sum_{P \in C} c(P).$$

Example.

[Figure: a curve with two marked points $P_1$ and $P_2$.]

For instance, $D = 3P_1 - 5P_2$ is a divisor with $\deg D = -2$.
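
Since a divisor has only finitely many non-zero coefficients, it can be modelled as a finite map from points to integers. The following sketch does so with plain dictionaries keyed by point labels. The labels and the function names `add_divisors` and `degree` are illustrative choices, not notation from the text.

```python
from collections import Counter

def add_divisors(D: dict, E: dict) -> dict:
    """Sum of two divisors given as {point label: integer coefficient}."""
    total = Counter()
    for div in (D, E):
        for point, coeff in div.items():
            total[point] += coeff
    # Drop points whose coefficient cancelled to zero.
    return {p: c for p, c in total.items() if c != 0}

def degree(D: dict) -> int:
    """deg D = sum of the coefficients c(P)."""
    return sum(D.values())

D = {"P1": 3, "P2": -5}          # the divisor 3*P1 - 5*P2 from the example
assert degree(D) == -2

E = {"P2": 5, "P3": 1}
assert add_divisors(D, E) == {"P1": 3, "P3": 1}
assert degree(add_divisors(D, E)) == degree(D) + degree(E)
```

The last assertion reflects that the degree is a group homomorphism from $\operatorname{Div} C$ to $\mathbb{Z}$.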

Consider the affine line, i.e. the curve $C$ which is the affine line. The rational
functions on $C$ are $r(x) = a(x)/b(x)$ where $a, b$ are polynomials in $x$. At any $\alpha \in C$, we
can expand $r(x)$ into a Laurent series, say

$$r(x) = \sum_{\nu = m}^{\infty} c_\nu (x - \alpha)^\nu$$

where (when $r \ne 0$) we may suppose that $c_m \ne 0$. We say $\operatorname{ord}_\alpha r = m$. We put
$\operatorname{ord}_\alpha 0 = +\infty$.
Consider an affine curve $C$. A rational function on $C$ is by definition a rational
function $r(x, y) \in \mathbb{C}(x, y)$ whose denominator is not identically zero on $C$, with two
functions $r, s$ considered equal if they coincide on $C$.

Now consider a curve $C \subset \mathbb{P}^2$. We would like to define rational functions on $C$.
They would have the form

$$r(x, y, z) = \frac{a(x, y, z)}{b(x, y, z)},$$

where $a, b$ are homogeneous polynomials of equal degree and $b(x, y, z)$ is not identically
zero on $C$. We will consider two rational functions on $C$ as equal if they coincide for
every point of $C$ where their denominators are not zero.


Example. Consider the curve $y^2 = f(x)$. Then $r(P) = y$ is a rational function.
We may rewrite $r$ as

$$r(P) = y + y^2 - f(x).$$

If the curve is interpreted to lie in $\mathbb{P}^2$, then we may write

$$r(P) = \frac{y}{z} = \frac{y z^{d-1} + y^2 z^{d-2} - f(x/z) z^d}{z^d},$$

where $d = \deg f$.

Now suppose that $r$ is some rational function on $C$ and $P$ is a point on $C$. If $P$
is non-singular, then there exists a tangent line to $C$ at $P$. Near $P$, everything can be
expressed as a function of a local parameter $t$ at $P$, as illustrated here.

[Figure: the curve near $P$, with the local parameter $t$ measured along the tangent line and $t = 0$ at $P$.]

For points on $C$ near $P$, the function $r$ may be written as a function of $t$, and $r$
has a Laurent series in $t$. We let $\operatorname{ord}_P r$ denote the order of this Laurent series.


Example. We return to the curve $y^2 = f(x) = a(x - \alpha_1)(x - \alpha_2)(x - \alpha_3)$. If the
$\alpha_i$ are real, then we could have the following picture.

[Figure: the real points of the curve, crossing the $x$-axis at $\alpha_1$, $\alpha_2$, $\alpha_3$.]

For $P$ on $C$, take $r(P) = y$ as in the previous example. Then $r$ vanishes only at
$P_i = (\alpha_i, 0)$. Since the tangent lines are vertical, the local parameter is $y$ itself, and
$\operatorname{ord}_{P_i} r = 1$ $(i = 1,2,3)$.


Example. Consider the above curve with $r(P) = x - \alpha_1$ for $P = (x, y)$. The
function $r$ vanishes at $P_1 = (\alpha_1, 0)$. There the local parameter is $y$. So we need to find
the Laurent series for $x - \alpha_1$ in the variable $y$. We have $x - \alpha_1 = c_2 y^2 + c_4 y^4 + \cdots$,
and so $\operatorname{ord}_{P_1}(x - \alpha_1) = 2$.

So far, we have only considered $\operatorname{ord}_P r$ for non-singular affine points. By
homogenizing the curve and the rational function $r$, we may consider non-singular projective
points as well.


Example. Return to the curve $y^2 = f(x) = a(x - \alpha_1)(x - \alpha_2)(x - \alpha_3)$ and the 
rational function r(P) = y. We homogenize to $y^2 z = a(x - \alpha_1 z)(x - \alpha_2 z)(x - \alpha_3 z)$ 
and r(P) = y/z. Consider the point at infinity $P_\infty = (0, 1, 0)$. Then 
$\mathrm{ord}_{P_\infty}(y/z) = -\mathrm{ord}_{P_\infty}(z/y)$. We consider the affine curve with y = 1, that is, 
$z = a(x - \alpha_1 z)(x - \alpha_2 z)(x - \alpha_3 z)$, and the corresponding point at infinity $P_\infty = (0, 0)$. 
We can now determine $\mathrm{ord}_{P_\infty}(z)$. If g(x, z) is the polynomial of this affine curve, then 
$$ \frac{\partial g}{\partial x} = 0, \qquad \frac{\partial g}{\partial z} = 1 \quad \text{at } P_\infty. $$

[Figure: the affine curve near $P_\infty$ in the (x, z)-plane.] 

So x is a local parameter, and we can expand $z = \gamma_3 x^3 + \cdots$. (See the figure 
above.) Then $\mathrm{ord}_{P_\infty}\, z = 3$ and $\mathrm{ord}_{P_\infty}(y/z) = -\mathrm{ord}_{P_\infty}(z) = -3$. 
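Combining this with what we saw above, we have 
$$ \mathrm{ord}_P\, y = \begin{cases} 1 & \text{if } P = P_i, \\ -3 & \text{if } P = P_\infty, \\ 0 & \text{everywhere else,} \end{cases} $$
and we see that 
$$ \sum_{P \in C} \mathrm{ord}_P\, y = 0. $$
This is, in fact, true in general. 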


THEOREM. If r is any non-zero rational function on a non-singular projective 
curve C, then 
$$ \sum_{P \in C} \mathrm{ord}_P\, r = 0. $$
This theorem will not be proved here. See e.g. Deuring (1973). Earlier, we had 
introduced the group of divisors, Div C, for a non-singular curve C. Given a non-zero 
rational function r on the curve C, we associate a divisor by 
$$ \mathrm{Div}\, r = \sum_{P \in C} (\mathrm{ord}_P\, r)\, P. $$
Then by the theorem, deg(Div r) = 0. 


Example. For the curve $y^2 = f(x)$ and r(P) = y, as above, we have 
$\mathrm{Div}\, y = P_1 + P_2 + P_3 - 3P_\infty$. 

A divisor D is called principal if D = Div r for some non-zero rational function r. We have 
Div(rs) = Div(r) + Div(s). We have the following inclusions: 
$$ \{\text{group of divisors}\} \supseteq \{\text{group of divisors of degree } 0\} \supseteq \{\text{group of principal divisors}\}. $$
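
The bookkeeping with divisors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary representation and the helper names `div_add` and `degree` are hypothetical, and the divisor of x − α₁ uses the double zero at P₁ together with the fact that its degree must be 0.

```python
from collections import Counter

def div_add(d1, d2):
    """Add two divisors given as {point_label: coefficient} mappings."""
    total = Counter(d1)
    for point, coeff in d2.items():
        total[point] += coeff
    # Drop points whose coefficients cancelled to zero.
    return {p: c for p, c in total.items() if c != 0}

def degree(d):
    """Degree of a divisor: the sum of its coefficients."""
    return sum(d.values())

# Div y on y^2 = a(x - alpha_1)(x - alpha_2)(x - alpha_3).
div_y = {"P1": 1, "P2": 1, "P3": 1, "Pinf": -3}
# Div(x - alpha_1): double zero at P1, hence a double pole at infinity.
div_x_minus_alpha1 = {"P1": 2, "Pinf": -2}

# Div(rs) = Div(r) + Div(s), and principal divisors have degree 0.
div_product = div_add(div_y, div_x_minus_alpha1)
print(div_product, degree(div_product))
```

The product y(x − α₁) then has divisor 3P₁ + P₂ + P₃ − 5P∞, again of degree 0.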
Now let D, E be two divisors on a non-singular curve C. Say 
$$ E = \sum_{P \in C} c^*(P)\, P, $$
and D as above, with coefficients c(P). We write $D \ge E$ if $c(P) \ge c^*(P)$ for every $P \in C$. This gives a partial 
ordering on the group of divisors. For a divisor D, we write $\mathcal{L}(D)$ to denote the set of 
rational functions r with 
$$ \mathrm{Div}\, r \ge -D. $$


Example. Let C be the x-axis and $P_1 = 1$, $P_2 = 2$. Let $D = 2P_1 - P_2$. Then 
$\mathcal{L}(D)$ consists of all rational functions r with $\mathrm{Div}\, r \ge -2P_1 + P_2$. This allows a pole of 
order at most 2 at $P_1$ and requires a zero of order at least 1 at $P_2$. Then, since no pole 
is allowed at $\infty$, $\mathcal{L}(D)$ consists of all r of the form 
$$ r(x) = \frac{(x - 2)(ax + b)}{(x - 1)^2} \quad (a, b \in \mathbb{C}), $$
which shows that it is a vector space over $\mathbb{C}$ of dimension 2. 

In general, $\mathcal{L}(D)$ is a vector space over $\mathbb{C}$. We let $\ell(D) = \dim \mathcal{L}(D)$. It is a 
consequence of the Riemann-Roch Theorem (see Deuring (1973)) that there is a unique 
non-negative integer g = g(C) such that if deg D > 2g − 2, then $\ell(D) = (\deg D) - g + 1$. 
This g is called the genus of C, and it turns out to be the same as the topological genus 
which we mentioned earlier. 

Example. Let C be the x-axis. Then it is easily seen that g = 0. For $P_1, P_2$ on 
C, let $D = 2P_1 - P_2$, so deg D = 1 > 2g − 2. Then $\ell(D) = 1 - 0 + 1 = 2$, as we had 
determined earlier in a special case. 

If D,E are divisors, we say D ~ E if D — E is a principal divisor. Notice that for 
D, E to be equivalent, it is necessary that deg D = deg E. 


Example. Let C be the x-axis. If P, Q are any two points on C, then (P) ~ (Q). 
For suppose that $P = \alpha$, $Q = \beta$, and both are finite. Then take $r(x) = (x - \alpha)/(x - \beta)$. 
Or if $P = \alpha$, $Q = \infty$, then take $r(x) = x - \alpha$. 

In fact, for any curve C with g = 0, two points P, Q on C are equivalent. To see 
this, let D = (P) − (Q). Then deg(−D) = 0 > 2g − 2, and we have $\ell(-D) = 1$ by the consequence 
of the Riemann-Roch Theorem which was mentioned above. Hence there exists a non-zero f 
with $\mathrm{Div}\, f \ge D$, that is, with $\mathrm{ord}_P f \ge 1$, $\mathrm{ord}_Q f \ge -1$, and $\mathrm{ord}_R f \ge 0$ otherwise. 
Since $\sum_{R \in C} \mathrm{ord}_R f = 0$, the inequalities are all equalities. Then D = Div f and (P) ~ (Q). 

Now consider the case where g = 1. Take D = (Q). Then deg D = 1 > 2g − 2 
and then $\ell(D) = 1$. Up to multiplication by constants, there exists only one function r which has 
at most a simple pole at Q and no other poles. Since the constants already lie in $\mathcal{L}(D)$, 
in this case $\mathcal{L}(D)$ consists only of constants, so (P) ~ (Q) is the 
same as P = Q. 


LEMMA 6A. Let C be a nonsingular curve with genus g = 1. Let O be a point 
on C. Given a divisor D with deg D = 0, there is a unique $P \in C$ with 
$$ (P) - (O) \sim D. $$

Proof. Let D' = D + (O) so that deg D' = 1 > 0 = 2g − 2. By the Riemann-Roch 
Theorem, $\ell(D') = \deg D' - g + 1 = 1$, so there exists a non-zero function f with $\mathrm{Div}\, f \ge -D'$. 
Since deg Div f = 0 = 1 − 1 = 1 + deg(−D'), there is a point P with Div f = −D' + (P) = 
−D − (O) + (P). But then (P) − (O) ~ D. The point P is unique, for if we had solutions 
P and P', then (P) ~ (P') and thus P = P' by previous work. 

Let C be a nonsingular curve of genus 1, and let O be a point on C. Then C 
together with O is called an elliptic curve. Let $D_0$ be the group of divisors of degree 
0 and $D_p$ the subgroup of principal divisors. By the preceding lemma, there is a 1-1 
correspondence between the points P on the curve and elements of the factor group 
$D_0/D_p$. (This factor group is called the Picard group.) Namely, P corresponds to the 
class of (P) − (O) modulo $D_p$. Since $D_0/D_p$ is a group, this induces a group structure 
on C. Let $P_1 + P_2$ be the sum of points, as defined in this way. Then 
$$ (P_1 + P_2) - (O) \sim (P_1) - (O) + (P_2) - (O). $$
Thus $P_1 + P_2$ is the unique point with 
$$ (P_1 + P_2) + (O) \sim (P_1) + (P_2). $$
The point O, which is called the base point, is the zero element of the group. Concerning 
the sum of n points, it is immediate that 
$$ (P_1 + \cdots + P_n) - (O) \sim (P_1) - (O) + \cdots + (P_n) - (O). $$

Our group law depends on the choice of the base point, but at any rate the group 
is isomorphic to $D_0/D_p$. Given points O, O', the canonical isomorphism $\varphi$ between the 
elliptic curves C(O), C(O') with respective base points O, O' is given by 
$$ (P) - (O) \sim (\varphi(P)) - (O'). $$


It may be shown that a nonsingular cubic curve has genus 1. Given a cubic curve 
C and two points $P_1, P_2$ on C, we would like to find a function f with 
$$ \mathrm{Div}\, f = (P_1 + P_2) + (O) - (P_1) - (P_2). $$
Then f has zeros of order 1 at $P_1 + P_2$ and O and poles of order 1 at $P_1, P_2$. For 
$P_1 \neq P_2$, we may have the following picture. 

[Figure: the cubic C with the line $\ell$ through $P_1, P_2$ and the line $\ell'$ through O and $P_1P_2$.] 

(Determine $P_1P_2$ as the third point of intersection of C with the line $\ell$ through $P_1, P_2$. Then 
determine $\ell'$ as the line through O and $P_1P_2$. Finally $P_1 + P_2$ is the third point of intersection 
of C and $\ell'$.) If $\ell$ is given by the linear form L(x) = 0 and $\ell'$ is given by L'(x) = 0, 
where x = (x, y, z), then f(x) = L'(x)/L(x) has the desired properties. 

In the case where $P_1 = P_2$ (= P, say) we take $\ell$ to be the tangent line to C at P. 
Again we have f(x) = L'(x)/L(x). 




In the case where $P_2 = O$, take $\ell = \ell'$, and f = 1. 

[Figure: the case $P_2 = O$, where $\ell = \ell'$ and $P_1 + O = P_1$.] 

We may also use this graphical technique to find −P for P on C. We draw the 
tangent line to C at the base point O. Call its third intersection point R. Then draw 
the line through P and R. Its third intersection point is −P. This is illustrated here. 

[Figure: the tangent at O meeting C again at R, and the line through P and R meeting C again at −P.] 

In the special case where O is a triple point on its tangent line, we have O = R 
and the picture simplifies. 

[Figure: the simplified construction of −P when O = R.] 

In this case we have another nice result. We have P + Q + S = O if and only if 
P, Q, S are collinear. 


Let C : $y^2 = f(x)$, where f is a cubic with distinct roots. Then we take O = (0, 1, 0), 
which is a triple point of the line at infinity. Since the lines through this point at infinity 
are simply vertical lines, our picture looks like this. 

[Figure: addition of points on $y^2 = f(x)$, using vertical lines through the base point at infinity.] 

Now suppose that our elliptic curve is a cubic of genus 1. Suppose also that it is 
defined by an equation with rational coefficients and that the base point O is rational. 
If P, Q are rational points on the curve then the third intersection point PQ is rational. Then, since O is rational, 
we have P + Q rational. Thus given a rational point on our curve, we can generate 
other rational points. 
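
The chord and tangent construction can be tried out with exact rational arithmetic. The following is a minimal sketch, assuming a curve of the special shape y² = x³ + ax + b with the base point O at infinity (represented by `None`); the curve y² = x³ − 2, the starting point (3, 5) and the function names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def add_points(P, Q, a):
    """Chord-tangent sum on y^2 = x^3 + a*x + b, with O = point at infinity (None)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                                  # vertical line: P + Q = O
    if P == Q:
        slope = (3 * x1 * x1 + a) / (2 * y1)         # tangent line at P
    else:
        slope = (y2 - y1) / (x2 - x1)                # chord through P and Q
    x3 = slope * slope - x1 - x2                     # third intersection with the curve
    y3 = slope * (x1 - x3) - y1                      # reflect: vertical line through O
    return (x3, y3)

def on_curve(P, a, b):
    """Check that an affine point (or O) lies on y^2 = x^3 + a*x + b."""
    if P is None:
        return True
    x, y = P
    return y * y == x ** 3 + a * x + b

a, b = Fraction(0), Fraction(-2)                     # y^2 = x^3 - 2
P = (Fraction(3), Fraction(5))
P2 = add_points(P, P, a)                             # 2P, via the tangent at P
P3 = add_points(P2, P, a)                            # 3P, via the chord through 2P and P
print(P2, on_curve(P3, a, b))
```

Starting from the single rational point (3, 5), every point produced this way again has rational coordinates, as the paragraph above asserts.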

Let $E(\mathbb{C})$ denote the group of all complex points on an elliptic curve E. Let $E(\mathbb{Q})$ 
denote the group of all rational points on E. Then $E(\mathbb{Q})$ is a subgroup of $E(\mathbb{C})$. 


THEOREM (Mordell-Weil). The group $E(\mathbb{Q})$ is finitely generated. 

The theorem as stated here is actually due to Mordell (1922), while Weil has generalized 
it. By the theorem, we know that 
$$ E(\mathbb{Q}) \cong \underbrace{\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}}_{r \text{ times}} \oplus \text{Torsion}, $$
where r is the rank of $E(\mathbb{Q})$, and the torsion part is finite. Curves with 
rank as high as 14 are known. The conjecture is that there exist curves of arbitrarily 
large rank. There are, on the other hand, only finitely many possibilities for the torsion 
part. 


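THEOREM (Mazur). There are exactly fifteen groups (the trivial group included) which may arise as the 
torsion part of $E(\mathbb{Q})$ for an elliptic curve E defined over $\mathbb{Q}$. Each of them has at most 16 elements. 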

Let $E \subset \mathbb{P}^2(\mathbb{Q})$ be an elliptic curve. Then every point P on E may be represented as 
(x(P), y(P), z(P)), where $x(P), y(P), z(P) \in \mathbb{Z}$ are relatively prime. This representation 
is unique up to sign. The Mordell-Weil height is defined by 
$$ h_0(P) = \log \max(|x(P)|, |y(P)|, |z(P)|). $$
The Néron-Tate height is given by 
$$ h(P) = \lim_{n \to \infty} \frac{h_0(2^n P)}{4^n}. $$


THEOREM. The limit above exists, and h(P) = 0 if and only if P is a torsion 
point. If P is non-torsion, then 
$$ h_0(P) \le c(E)\, h(P), $$
where c(E) is a constant depending on E. Furthermore, h(P) is a quadratic form on 
$E(\mathbb{Q})$, where a quadratic form is defined as below. 

A quadratic form on an abelian group G is a real-valued function f such that for 
any P, Q in G we have 
$$ f(P + Q) + f(P - Q) = 2f(P) + 2f(Q). $$

The theorems stated in this section will not be proved here. For more on elliptic 
curves, see Silverman (1985). 
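
To get a feeling for the two heights, one can compute the Mordell-Weil height of the points 2ⁿP and divide by 4ⁿ. The sketch below assumes that the Néron-Tate height is the limit of these quotients; the curve y² = x³ − 2, the point (3, 5) and the function names are illustrative assumptions only.

```python
from fractions import Fraction
from math import gcd, log

def double(P, a):
    """2P on y^2 = x^3 + a*x + b for an affine point P with y != 0."""
    x, y = P
    slope = (3 * x * x + a) / (2 * y)                # tangent line at P
    x2 = slope * slope - 2 * x
    return (x2, slope * (x - x2) - y)

def naive_height(P):
    """h_0(P) = log max(|X|, |Y|, |Z|) for coprime integer coordinates (X, Y, Z)."""
    x, y = P
    # The least common denominator gives coprime projective coordinates.
    common = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    X, Y, Z = int(x * common), int(y * common), common
    return log(max(abs(X), abs(Y), abs(Z)))

a = Fraction(0)                                      # curve y^2 = x^3 - 2
Q = (Fraction(3), Fraction(5))
ratios = []
for n in range(5):
    ratios.append(naive_height(Q) / 4 ** n)          # h_0(2^n P) / 4^n
    Q = double(Q, a)
print(ratios)
```

The printed quotients are expected to settle down quickly, which is the numerical shadow of the quadratic growth of h along multiples of P.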
Exercise 6a. Let f be a quadratic form on the abelian group G, and let $P_1, \ldots, P_k$ be in G. Then 
$$ f(n_1 P_1 + \cdots + n_k P_k) = \sum_{i,j=1}^{k} a_{ij}\, n_i n_j, $$
where the coefficients $a_{ij}$ depend on $P_1, \ldots, P_k$. 


§7. The Rank of Cubic Thue Curves. 


Consider the cubic Thue equation 
$$ F(x, y) = m, $$
where F is a homogeneous cubic polynomial in two variables with no multiple factors. 
The genus is g = 1. This equation has the homogeneous form 
$$ F(x, y) = m z^3. $$
If the corresponding elliptic curve contains at least one rational point, then we can study 
the group of rational points $E_m(\mathbb{Q})$. 

THEOREM 7A. Given any such F, there is an integer $m_0 > 0$ such that 
$\mathrm{rank}\, E_{m_0}(\mathbb{Q}) \ge 1$. 


Our proof, which comes later, will follow the work of Silverman (1983). 


PROPOSITION 7B. The group of rational points on the curve $x^3 + y^3 = 657 z^3$ 
has rank 3. 


This special result is not proven here. See tables compiled by Stephens (1968). The 
following result of Silverman (1983) shows that there exist Thue equations with rank at 
least 4. 


PROPOSITION 7C. For certain cubics F and certain values of m, we have 
$$ \mathrm{rank}\, E_m(\mathbb{Q}) \ge 4. $$

What about the torsion part of $E(\mathbb{C})$? Given a natural number n, there are $n^2$ 
points $P \in E(\mathbb{C})$ with nP = O. 


LEMMA 7D. Suppose $E(\mathbb{Q})$ is as above. Given any integer $d \ge 1$, there are only 
finitely many points in $E(\mathbb{C})$ which are torsion points and whose coordinates generate 
an algebraic number field of degree no greater than d. In other words, the set 
$$ \bigcup_{\substack{L \text{ number field} \\ [L:\mathbb{Q}] \le d}} E(L)_{\text{torsion}} $$
is finite. 

The proof follows Silverman (1983), but first we need some preliminaries. Suppose 
$E(\mathbb{Q})$ and a prime p are given. By reduction of the curve modulo p we mean the 
following. Consider the corresponding equation f(x, y) = 0, which is an equation over 
$\mathbb{Z}$, and reduce the coefficients modulo p to obtain the new equation $\bar{f}(x, y) = 0$ over the 
finite field $\mathbb{F}_p$ with p elements. We say that we have a good reduction if the new curve 
is also an elliptic curve. In the previous section, we considered the equation $y^2 = f(x)$, 
where f was a cubic polynomial over $\mathbb{Z}$ with distinct roots. In this case if $\bar{\Delta}_f \neq 0$, we 
have a good reduction. We also considered the cubic Thue equation F(x, y) = m, with 
certain restrictions on F. Here, if $\bar{\Delta}_F \neq 0$ and $\bar{m} \neq 0$, we have a good reduction. In 
these cases, if p is a sufficiently large prime, we will get a good reduction. This is in 
fact true in general. 

Now let P = (x, y) be a rational point on E. If neither x nor y has a factor of 
p in its denominator, then consider $\bar{x}, \bar{y}$. We have $\bar{f}(\bar{x}, \bar{y}) = 0$. If p does occur in the 
denominator of either x or y, then homogenize the point P to get $(x, y, z) \in \mathbb{Z}^3$ with 
gcd(x, y, z) = 1. Then take the point $(\bar{x}, \bar{y}, \bar{z})$. 

Example. Consider the equation $43x^3 - 2y^3 = 1$, the point P = (1/3, 2/3), and 
the prime p = 3. Homogenize to get P = (1, 2, 3) on the curve $43x^3 - 2y^3 = z^3$. Then 
$\bar{P} = (1, -1, 0)$ satisfies $\bar{x}^3 + \bar{y}^3 = \bar{z}^3$. 
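
The reduction step in this example is easy to replay mechanically. The following sketch (with hypothetical helper names `homogenize` and `reduce_mod`) clears denominators, reduces the coordinates modulo p using residues centred around 0, and checks the reduced equation.

```python
from fractions import Fraction
from math import gcd

def homogenize(x, y):
    """Coprime integer coordinates (X, Y, Z) of the rational point (x, y)."""
    common = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    return (int(x * common), int(y * common), common)

def reduce_mod(point, p):
    """Reduce integer coordinates mod p, with representatives in (-p/2, p/2]."""
    def centered(n):
        r = n % p
        return r - p if r > p // 2 else r
    return tuple(centered(c) for c in point)

x, y = Fraction(1, 3), Fraction(2, 3)
assert 43 * x**3 - 2 * y**3 == 1                     # P lies on 43x^3 - 2y^3 = 1

X, Y, Z = homogenize(x, y)                           # projective coordinates of P
Xb, Yb, Zb = reduce_mod((X, Y, Z), 3)                # reduction modulo p = 3
# Modulo 3 the curve 43x^3 - 2y^3 = z^3 becomes x^3 + y^3 = z^3.
print((X, Y, Z), (Xb, Yb, Zb), (Xb**3 + Yb**3 - Zb**3) % 3)
```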


Proof of Lemma 7D. Let $E(\mathbb{Q})$ and $d \ge 1$ be given. Choose a prime p such 
that E has good reduction at p. Let L be any field with $[L : \mathbb{Q}] = d$, and choose a 
prime ideal $\mathfrak{P}$ of L lying above p. Take the curve E(L) to be the set of all points on 
E with coordinates in L. Reduce this mod $\mathfrak{P}$ to get $\bar{E}(\mathbb{F}_{\mathfrak{P}})$, where $\mathbb{F}_{\mathfrak{P}} = \mathfrak{O}/\mathfrak{P}$ is a 
finite field and $\mathfrak{O}$ is the ring of integers in L. From algebraic number theory, we have 
$\mathrm{card}\, \mathbb{F}_{\mathfrak{P}} \le p^d$. So we have a map 
$$ E(L) \to \bar{E}(\mathbb{F}_{\mathfrak{P}}), $$
where $\bar{E}(\mathbb{F}_{\mathfrak{P}})$ is the set of all points on the reduced curve with coordinates in $\mathbb{F}_{\mathfrak{P}}$. 
Now let $E(L)_p$ denote the "prime to p torsion" of E(L) consisting of points P with 
the property that mP = O for some integer m with $p \nmid m$. Then the map 
$$ E(L)_p \to \bar{E}(\mathbb{F}_{\mathfrak{P}}) $$
is injective. (For a proof of this fact, see Silverman (1985, p.176).) We know that 
$\bar{E}(\mathbb{F}_{\mathfrak{P}}) \subset \mathbb{P}^2(\mathbb{F}_{\mathfrak{P}})$, so we have 
$$ \mathrm{card}\, \bar{E}(\mathbb{F}_{\mathfrak{P}}) \le (\mathrm{card}\, \mathbb{F}_{\mathfrak{P}})^2 + (\mathrm{card}\, \mathbb{F}_{\mathfrak{P}}) + 1. $$
Using the injective map above and the bound on $\mathrm{card}\, \mathbb{F}_{\mathfrak{P}}$ gives a bound 
$$ \mathrm{card}\, E(L)_p \le B(p, d, E). $$
Given another prime q with good reduction, we have 
$$ \mathrm{card}\, E(L)_q \le B(q, d, E). $$
But 
$$ \mathrm{Tor}\, E(L) \subset E(L)_p + E(L)_q, $$
so then, with p and q fixed in terms of E, 
$$ \mathrm{card}\, (\mathrm{Tor}\, E(L)) \le B(d, E). $$
Then for $P \in \mathrm{Tor}\, E(L)$, we have mP = O for some integer m with $m \le B(d, E)$. Here 
m may be different for each point P, but for fixed m, the number of such points P is 
no greater than $m^2$. So the total number of such points P, i.e. for any $m \le B(d, E)$, is 
$$ \le (B(d, E)!)^2, $$
and the result is proved. 
Before giving the proof of Theorem 7A, we state one further lemma. 


LEMMA 7E. Suppose $F(x, y) = m z^3$ is given where $F(x, y) = a(x - \alpha_1 y)(x - \alpha_2 y)(x - \alpha_3 y)$. 
The points $(\alpha_i, 1, 0)$ (i = 1, 2, 3) are triple points on the curve. 

Proof. The lemma is true, since on the line $x = \alpha_i y$, the only point on the curve 
has z = 0. 


Proof of Theorem 7A. Given F, start with the curve 
$$ E : F(x, y) = z^3. $$

[Figure: the curve E with the triple point $(\alpha_1, 1, 0)$ and the point $P_t'$.] 

As illustrated, the point $P_t' = (t, 1, \sqrt[3]{F(t,1)})$, with $t \in \mathbb{Z}$, $\sqrt[3]{F(t,1)} \neq 0$, lies on E. 
Now take the new curve 
$$ E_t : F(x, y) = F(t, 1)\, z^3, $$
which contains the point $P_t = (t, 1, 1)$. 

[Figure: the curve $E_t$ with the triple point $(\alpha_1, 1, 0)$ and the point $P_t$.] 

We have a mapping 
$$ E \to E_t : (x, y, z) \mapsto \left(x, y, z/\sqrt[3]{F(t,1)}\right), $$
and it is a group isomorphism if $O = (\alpha_1, 1, 0)$ is the base point. 

We consider E as a curve over $K = \mathbb{Q}(\alpha_1)$. Then $P_t'$ has coordinates in a cubic 
extension of K. By Lemma 7D, for all but finitely many t, the point $P_t'$ is non-torsion 
on E. So except for finitely many t, the point $P_t$ is non-torsion on $E_t$. 

Unfortunately, the base point O is not defined over $\mathbb{Q}$, but we want to consider 
$\mathrm{rank}\, E_m(\mathbb{Q})$, where m = F(t, 1). So we take $P_t$ to be the new base point of $E_t$, and let 
$Q_t$ be the third point of intersection of the tangent line to $E_t$ at $P_t$. Now $P_t, Q_t \in E_t(\mathbb{Q})$, 
and with respect to this new base point, we claim that $Q_t$ is non-torsion. 

For suppose that $Q_t$ were torsion with respect to the base $P_t$. Then $nQ_t = O$ for 
some integer n, which we write as an equivalence 
$$ n(Q_t) \sim n(P_t). \qquad (7.1) $$
Since the original base point O is a triple point, we have $Q_t = -2P_t$, in the group 
operations with respect to base O. We write this as 
$$ (Q_t) + 2(P_t) \sim 3(O). $$
Then 
$$ n(Q_t) + 2n(P_t) \sim 3n(O). \qquad (7.2) $$
Combining equivalences (7.1) and (7.2) gives 
$$ 3n(P_t) \sim 3n(O), $$
so $P_t$ is torsion with respect to the base O, a contradiction. 
§8. Lower Bounds for the Number of Solutions of Cubic Thue Equations. 


In (1933), Chowla studied equations of the form 
$$ x^3 - k y^3 = m. $$
Suppose $k \neq 0$ is given and let Z(m) denote the number of solutions to the equation. 
Then Chowla proved that 
$$ Z(m) = \Omega_k(\log \log m)^{\dagger} $$
as $m \to \infty$. In other words, 
$$ Z(m) > c_k \log \log m $$
for infinitely many values of m. Mahler (1935) studied the more general cubic Thue 
equations 
$$ F(x, y) = m. $$
He showed that 
$$ Z_F(m) = \Omega_F((\log m)^{1/4}). $$

The latest result, which we state here, is due to Silverman (1983). 


THEOREM 8A. Suppose F(x, y) is a form of degree 3 without multiple factors. 
Suppose there is some integer $m_0 \neq 0$ such that $F(x, y) = m_0$ has a rational solution 
and that the corresponding elliptic curve has Mordell-Weil rank R > 0. Then if $Z_F(m)$ 
is the number of solutions of F(x, y) = m, we have 
$$ Z_F(m) = \Omega_F((\log m)^{R/(R+2)}). $$

In the preceding section, we showed (Theorem 7A) that there always exists an $m_0$ 
with rank $R \ge 1$, which gives the following result. 

COROLLARY 8B. For any cubic form F as above, we have 
$$ Z_F(m) = \Omega_F((\log m)^{1/3}). $$

COROLLARY 8C. Suppose $F(x, y) = x^3 + y^3$. Then we can use $m_0 = 657$ and 
R = 3, so 
$$ Z_F(m) = \Omega((\log m)^{3/5}). $$

COROLLARY 8D. For certain cubic forms F, we can find $m_0$ with $R \ge 4$, so 
$$ Z_F(m) = \Omega_F((\log m)^{2/3}). $$

Corollaries 8C and 8D follow from Propositions 7B and 7C, respectively. 
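
The exponents in the three corollaries are just the values of R/(R+2) at R = 1, 3, 4. A one-line check in exact arithmetic (the function name `omega_exponent` is only a label for this sketch):

```python
from fractions import Fraction

def omega_exponent(rank):
    """Exponent R/(R+2) of log m in Theorem 8A for Mordell-Weil rank R."""
    return Fraction(rank, rank + 2)

# Ranks used in Corollaries 8B, 8C and 8D respectively.
exponents = {R: omega_exponent(R) for R in (1, 3, 4)}
print(exponents)
```

Since R/(R+2) increases with R and tends to 1, larger rank gives a stronger lower bound, but the exponent always stays below 1.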


†Given f(m), g(m) > 0, we say $f(m) = \Omega(g(m))$ if there exists a constant c > 0 such that 
f(m) > c g(m) for infinitely many values of m. 

Remarks 
(i) Nothing like this is known for deg F > 3. It is quite possible that in this case there 
are bounds which are independent of m. It is also possible that Siegel's conjecture 
(Ch. III, first paragraph of §7) is true for curves of genus g > 1. 

(ii) The function $Z_F(m)$ counts primitive as well as non-primitive solutions of F(x, y) = 
m. If $P_F(m)$ denotes the number of primitive solutions, then for all that we know, 
it is possible even in the cubic case that $P_F(m)$ is bounded independently of m. 

(iii) The numbers m in the proof of the theorem will be of the type $m = m_0 \ell^3$ where $\ell$ is 
large. It is conceivable that the number of solutions is bounded as m runs through 
cube-free integers. 


Proof. Let $E_0$ be the curve given by the homogeneous equation 
$$ F(x, y) = m_0 z^3. $$
For points P on $E_0$, write (x(P), y(P), z(P)), where x(P), y(P), z(P) are coprime integers. 
By hypothesis, this curve has Mordell-Weil rank R > 0. So there exist points 
$P_1, \ldots, P_R$ on $E_0(\mathbb{Q})$ which generate a free abelian group of rank R. Let an integer 
$N \ge 1$ be given and consider the set $\mathfrak{S}(N)$ of points 
$$ n_1 P_1 + \cdots + n_R P_R, $$
with 
$$ 0 < n_i \le N \quad (i = 1, \ldots, R). $$
Then $\mathrm{card}\, \mathfrak{S}(N) = N^R$ and all of the points of $\mathfrak{S}(N)$ lie on the curve $E_0$. 

If the base point O is chosen appropriately, then all of the non-torsion points P 
will have $z(P) \neq 0$. This is true if no root $\alpha_i$ of F(x, 1) is rational. If some root 
$\alpha_i$ is rational, then put $O = (\alpha_i, 1, 0)$, which is a triple point of intersection (of the 
curve and its tangent at O). Then if z(P) = 0, we have F(x(P), y(P)) = 0, and thus 
$x(P) = \alpha_j y(P)$ for some root $\alpha_j$. Again P is a triple point of intersection. Then 3P = O, 
a contradiction. 

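Now that we know $z(P) \neq 0$ for the non-torsion points, we put 
$$ (8.1) \qquad m = m(N) = m_0 \prod_{P \in \mathfrak{S}(N)} z(P)^3. $$
We know that 
$$ F(x(P), y(P)) = m_0\, z(P)^3 $$
for $P \in \mathfrak{S}(N)$, and therefore 
$$ F\Big( \frac{x(P)}{z(P)} \prod_{Q \in \mathfrak{S}(N)} z(Q),\ \frac{y(P)}{z(P)} \prod_{Q \in \mathfrak{S}(N)} z(Q) \Big) = m_0 \prod_{Q \in \mathfrak{S}(N)} z(Q)^3 = m. $$
Therefore we have at least $N^R$ integer solutions to F(x, y) = m. 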
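
The rescaling trick is worth seeing on a toy case. The sketch below uses F(x, y) = x³ + y³ and m₀ = 9 with three hand-picked rational points on F(x, y) = 9z³; these points are an illustrative assumption and are not claimed to be the set of multiples used in the proof.

```python
from math import prod

def F(x, y):
    """The cubic form F(x, y) = x^3 + y^3 (illustrative choice)."""
    return x**3 + y**3

m0 = 9
# Points (x, y, z) with coprime integer coordinates on F(x, y) = m0 * z^3.
points = [(2, 1, 1), (1, 2, 1), (20, -17, 7)]
assert all(F(x, y) == m0 * z**3 for x, y, z in points)

# m = m0 * prod z(P)^3, as in (8.1), taken over the chosen points.
z_product = prod(z for _, _, z in points)
m = m0 * z_product**3

# Each point gives the integer solution (x/z * prod z, y/z * prod z) of F = m.
solutions = [(x * z_product // z, y * z_product // z) for x, y, z in points]
print(m, solutions)
```

Each rational point on the fixed curve thus becomes an integer solution of one and the same equation F(x, y) = m, at the cost of making m large.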
Thus we need to obtain a lower bound for $N^R$ in terms of m. For a non-torsion 
point P = (x(P), y(P), z(P)) we have 
$$ \log |z(P)| \le h_0(P) = \log \max(|x(P)|, |y(P)|, |z(P)|) \le c\, h(P), $$
where $h_0(P)$ is the Mordell-Weil height and h(P) is the Néron-Tate height. Notice that 
the last inequality holds since P is non-torsion. For the point $P = n_1 P_1 + \cdots + n_R P_R$, 
we have 
$$ \log |z(P)| \le c \sum_{i,j=1}^{R} c_{ij}\, n_i n_j \le c^{*}(n_1^2 + \cdots + n_R^2) \le c^{**} N^2, $$
since h(P) is a positive definite quadratic form in $n_1, \ldots, n_R$. Here c and the $c_{ij}$ may 
depend on $P_1, \ldots, P_R$, thus $c^{**}$ may depend on these points as well. By the definition 
of m in (8.1), we also have 
$$ \log |m| = \log |m_0| + 3 \sum_{P \in \mathfrak{S}(N)} \log |z(P)| \le 3 c^{**} N^{R+2} + \log |m_0|. $$
Combining this with our previous estimate for $Z_F(m)$, we get 
$$ Z_F(m) \ge N^R \ge c^{***} (\log |m|)^{R/(R+2)}, $$
as desired. 

Remark. Since, as we have seen in Theorem 1C of Ch. III, the number of solutions 
of $|F(x, y)| \le m$ is of smaller order of magnitude than m, the average number of solutions 
of F(x, y) = m, as m varies, is zero. 


§9. Upper Bounds for Rational Points on Certain Elliptic Equations in 
Terms of the Mordell-Weil Rank. 
Consider the Néron-Tate height h(P) of points $P \in E(\mathbb{Q})$. We remarked in section 
6 that 
(i) h(P) > 0 if and only if P is non-torsion, 
(ii) h(P) is a quadratic form. 
Suppose that Q is a torsion point. Then for any point P, and for $n, m \in \mathbb{Z}$, 
$$ h(nP + mQ) = a n^2 + b n m + c m^2. $$
If $\ell Q = O$ where $\ell$ is a positive integer, then $h(nP + m\ell Q) = h(nP)$, so that 
$b n m \ell + c m^2 \ell^2 = 0$ for all $n, m \in \mathbb{Z}$, and therefore b = c = 0. In particular, h(P + Q) = h(P), or in 
other words, h(P) = h(P') if P − P' is a torsion point. Therefore h(P) is defined on 
the factor group $E(\mathbb{Q})$/Torsion. 

Say $\mathrm{rank}\, E(\mathbb{Q}) = R$ and 
$$ E(\mathbb{Q}) = \mathbb{Z}P_1 \oplus \cdots \oplus \mathbb{Z}P_R \oplus \text{Torsion}. $$
Then $h(n_1 P_1 + \cdots + n_R P_R) = \sum_{i,j=1}^{R} a_{ij}\, n_i n_j$. By (i), the quadratic form on the 
right here is positive if $n_1, \ldots, n_R$ lie in $\mathbb{Z}$ and are not all 0. In fact, it is known that 
this quadratic form is positive definite, i.e. it is positive if $n_1, \ldots, n_R$ lie in $\mathbb{R}$ and are 
not all 0. It is easily seen that this property does not depend on our choice of the basis 
points $P_1, \ldots, P_R$, and it is usually expressed by stating that 
(iii) h(P) is a positive definite quadratic form on $E(\mathbb{Q})$/Torsion. 

A consequence is that 
$$ h(n_1 P_1 + \cdots + n_R P_R) \ge c_1 (n_1^2 + \cdots + n_R^2) $$
where $c_1 > 0$. The number of integer tuples $n_1, \ldots, n_R$ with $h(n_1 P_1 + \cdots + n_R P_R) \le \xi$ is 
$\le c_2\, \xi^{R/2} + 1$. From Mazur's Theorem, card (Torsion) $\le 16$. As a consequence, the 
number of points $P \in E(\mathbb{Q})$ with $h(P) \le \xi$ where $\xi \ge 1$ is 
$$ \le c_3(E)\, \xi^{R/2}. $$
Now we will consider elliptic equations of the form 
$$ E_D : y^2 = x^3 + D $$
and 
$$ E_{mD} : y^2 = x^3 + mD, $$
where D is given and m varies. Recall that $h_0$ denotes the Mordell-Weil height. 

THEOREM 9A. Let $D \neq 0$ and $\xi \ge 1$ be given. If $m \ge m_0(D)$ and m is sixth 
power free, then the number of $P \in E_{mD}(\mathbb{Q})$ with 
$$ h_0(P) < \xi \log m $$
is less than 
$$ 16 \cdot (c_4 \sqrt{\xi})^{\mathrm{rank}\, E_{mD}(\mathbb{Q})}, $$
where $c_4$ is an absolute constant. 
In earlier sections, we used the Weierstrass type equation 
$$ y^2 = f(x), $$
where f(x) is a cubic polynomial. Here it will be necessary to use the more general 
Weierstrass form 
$$ y^2 + a_1 x y + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6. $$
It is easy to see that we may transform the second form into the first by the change of 
variables 
$$ y' = y + (a_1 x/2) + (a_3/2). $$
In general, transformations of the form 
$$ (9.1) \qquad y = u^3 y' + u^2 s x' + t, \qquad x = u^2 x' + r, $$
where $u \neq 0$, will transform a generalized Weierstrass equation into another equation of 
the same type. In fact, it can be shown that these are the only "rational" transformations 
with this property. We say that two Weierstrass equations over $\mathbb{Z}$ are equivalent if 
they are related by a transformation of the type (9.1). 
Each general Weierstrass equation W is equivalent to a Weierstrass equation $y^2 = f(x)$. 
We then set 
$$ \Delta(W) = 16\, \mathrm{discr}(f); $$
it is easily shown that this quantity depends on W only. 

If W has integral coefficients, f no longer necessarily does, but it may be shown 
that $\Delta(W)$ will be integral. Among all equivalent equations with integer coefficients we 
may pick one where $|\Delta(W)|$ is minimal. This discriminant is called $\Delta_{\min}$, and it may 
be associated with a Weierstrass equation of the more general form. One may check 
the algebra to see that $\Delta(W) = u^{12} \Delta(W')$ for W, W' related by a transformation as 
above. Thus if some $\Delta(W)$ is not divisible by any twelfth power, then we know that 
$\Delta_{\min} = \Delta(W)$. 


Example. Consider our equation $y^2 = x^3 + D$. Then (see below) $\Delta = -16 \cdot 27 D^2$. 
If $W : y^2 = x^3 + a^6 b$, then $\Delta(W) = -16 \cdot 27 a^{12} b^2$. Now let $y = a^3 y'$, $x = a^2 x'$. Then 
we have $W' : y'^2 = x'^3 + b$, and $\Delta(W') = -16 \cdot 27 b^2$. 

LEMMA 9B. Consider the curves 
$$ E_D : y^2 = x^3 + D $$
and 
$$ E_{mD} : y^2 = x^3 + mD. $$
If m is sixth power free, then 
$$ \log |\Delta_{\min}(E_{mD})| \ge 2 \log |m| - 10 \log |6D|. $$


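Proof. In the example, we said that 
$$ \Delta(E_D) = -16 \cdot 27 D^2. $$
We may check this by considering the roots of $f(x) = x^3 + D$. They are $\alpha_1 = \alpha$, $\alpha_2 = 
\alpha\zeta$, $\alpha_3 = \alpha\zeta^2$, where $\alpha = \sqrt[3]{-D}$ and $\zeta$ is a primitive cube root of unity. Then 
$$ \begin{aligned} (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3) &= \alpha^3 (1 - \zeta)(1 - \zeta^2)(\zeta - \zeta^2) \\ &= -D(1 - \zeta)^3(\zeta + \zeta^2) \\ &= D(1 - \zeta)^3 \\ &= D(1 - 2\zeta + \zeta^2)(1 - \zeta) \\ &= -D(3\zeta)(1 - \zeta) \\ &= -3D(\zeta - \zeta^2) \\ &= -3D(2\zeta + 1) = -3\sqrt{3}\, i\, D, \end{aligned} $$
since $\zeta = (-1 + i\sqrt{3})/2$. Then $\mathrm{discr}\, f = -27 D^2$. We also have 
$$ \Delta(E_{mD}) = -16 \cdot 27 m^2 D^2. $$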
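
The value of the discriminant can be double-checked numerically from the roots. This is a floating-point sketch only; the function name `disc_x3_plus_D` is a label introduced here, and any cube root of −D may serve as α.

```python
import cmath

def disc_x3_plus_D(D):
    """Discriminant of x^3 + D as the squared product of root differences."""
    alpha = complex(-D) ** (1 / 3)                   # one cube root of -D
    zeta = cmath.exp(2j * cmath.pi / 3)              # primitive cube root of unity
    r1, r2, r3 = alpha, alpha * zeta, alpha * zeta**2
    diff = (r1 - r2) * (r1 - r3) * (r2 - r3)
    return diff * diff

for D in (2, -5, 7):
    # The two printed values should agree up to rounding error.
    print(D, disc_x3_plus_D(D), -27 * D * D)
```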


For any transformation allowed, we would have $\Delta = \ell^{12} \Delta'$, where $\ell \in \mathbb{Q}^*$. So we 
get 
$$ -16 \cdot 27 m^2 D^2 = \Delta(E_{mD}) = \ell^{12} \Delta_{\min}(E_{mD}). $$
Write $m = m_1 m_2$, where $m_1$ is a product of primes p with $p \nmid 6D$ and $m_2$ is a product 
of primes p with $p \mid 6D$. Since m is sixth power free, we have 
$$ m_1^2 \mid \Delta_{\min}(E_{mD}) $$
and 
$$ m_2 \le (6D)^5. $$
Then 
$$ |\Delta_{\min}(E_{mD})| \ge m_1^2, $$
and the result follows by taking logarithms and applying the inequality for $m_2$. 
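
The factorization m = m₁m₂ used in this proof can be computed without factoring m, by repeatedly removing the common part with 6D. A small sketch (the function name `split_m` and the sample values are hypothetical):

```python
from math import gcd, log

def split_m(m, D):
    """Write |m| = m1 * m2 with m1 coprime to 6D and all primes of m2 dividing 6D."""
    base = abs(6 * D)
    m1, m2 = abs(m), 1
    g = gcd(m1, base)
    while g > 1:                                     # strip primes shared with 6D
        m1 //= g
        m2 *= g
        g = gcd(m1, base)
    return m1, m2

m, D = 2**5 * 5**2 * 7, 1                            # a sixth power free sample m
m1, m2 = split_m(m, D)
# Lemma 9B rests on m2 <= |6D|^5, hence 2 log m1 >= 2 log m - 10 log |6D|.
print(m1, m2, 2 * log(m1) >= 2 * log(m) - 10 * log(abs(6 * D)))
```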


LEMMA 9C. Consider 
$$ E_D : y^2 = x^3 + D. $$
If P is a non-torsion point on $E_D$, then 
$$ h(P) > c_5 \log |\Delta_{\min}(E_D)|, $$
where $c_5 > 0$ is absolute. 

This result is due to Silverman (1981). Lang has conjectured that the result is true 
for more general elliptic curves in Weierstrass form. 

LEMMA 9D. Let $h_0$ denote the Mordell-Weil height and h the Néron-Tate height. 
Then for P on a curve E in general Weierstrass form, 
$$ |h(P) - h_0(P)| \le c_6 \log |\Delta(E)|. $$
This result is due to Zimmer (1976). 


Proof. (of Theorem 9A.) Recall, we want to count the number of $P \in E_{mD}(\mathbb{Q})$ 
with 
$$ h_0(P) < \xi \log m, $$
where $\xi$ and D are given. By Lemma 9D, we know that this is less than or equal to the 
number of $P \in E_{mD}(\mathbb{Q})$ having 
$$ h(P) < \xi \log m + c_6 \log |\Delta(E_{mD})|. $$
Write $P \in E_{mD}(\mathbb{Q})$ as P = P' + P'' where P'' $\in$ Torsion and $P' = n_1 P_1 + \cdots + n_R P_R$. By 
Mazur's Theorem, we have card (Torsion) $\le 16$. Combining this with Lemma 9B, we 
see that the number of P being counted is no greater than 
$$ 16 \cdot \mathrm{card}\{P' \in E_{mD}(\mathbb{Q}) : h(P') < \xi \log m + 2 c_6 \log m + c_7(D)\}. $$

Now we are back to a problem in the Geometry of Numbers. We know that h(P') 
is a positive definite quadratic form F in R variables, and we need to count the number 
of points P' with $h(P') < \nu = \xi \log m + 2 c_6 \log m + c_7(D)$. We also have for nonzero P': 
$$ h(P') \ge c_5 \log |\Delta_{\min}(E_{mD})| = \nu_1, $$
say, by Lemma 9C. We use exercise 2b of Chapter I. Let K be the set of all x with 
$F(x) \le 1$. We know that every integer point $x \neq 0$ has $F(x) \ge \nu_1$. So the first 
minimum $\lambda_1$ satisfies $\lambda_1 \ge \sqrt{\nu_1}$. We count the number of points x with $F(x) \le \nu$, i.e. 
the number of integer points in the set $\sqrt{\nu}\, K$. By the exercise, the number of such points 
is 
$$ \le \left( 2 \sqrt{\frac{\nu}{\nu_1}} + 1 \right)^{R}. $$
Since for $m \ge m_0(D)$ we have $\nu < (\xi + 2 c_6 + 1) \log m \le c_8\, \xi \log m$, and $\nu_1 \ge c_9 \log m$, 
we have $\sqrt{\nu/\nu_1} \le c_{10}\, \xi^{1/2}$, so that we obtain 
$$ \le (2 c_{10}\, \xi^{1/2} + 1)^{R} \le (c_4\, \xi^{1/2})^{R}. $$
Theorem 9A follows. 


§10. Isogenies. 


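Let $C_1, C_2$ be curves in $\mathbb{C}^2$. We want to discuss maps from $C_1$ to $C_2$. A rational 
map is a map given by rational functions $\varphi, \psi$ on $C_1$ such that whenever $(x, y) \in C_1$ 
and $\varphi, \psi$ are defined at (x, y), then $(\varphi(x, y), \psi(x, y)) \in C_2$. 

However, it is better to think of $C_1, C_2$ as curves in $\mathbb{P}^2$. Then we consider maps of 
the form 
$$ (x, y, z) \mapsto (\varphi_1(x, y, z), \varphi_2(x, y, z), \varphi_3(x, y, z)), $$
where $\varphi_1, \varphi_2, \varphi_3$ are homogeneous polynomials of equal degree such that for $(x, y, z) \in C_1$ 
we don't have $(\varphi_1, \varphi_2, \varphi_3)$ identically (0, 0, 0), and we have $(\varphi_1(x, y, z), \varphi_2(x, y, z), 
\varphi_3(x, y, z)) \in C_2$ whenever $(\varphi_1, \varphi_2, \varphi_3) \neq (0, 0, 0)$. More precisely, we consider equivalence 
classes of such functions $(\varphi_1, \varphi_2, \varphi_3)$. One says 
$$ (\varphi_1, \varphi_2, \varphi_3) \sim (\varphi_1', \varphi_2', \varphi_3') $$
if the matrix 
$$ \begin{pmatrix} \varphi_1(\mathbf{x}) & \varphi_2(\mathbf{x}) & \varphi_3(\mathbf{x}) \\ \varphi_1'(\mathbf{x}) & \varphi_2'(\mathbf{x}) & \varphi_3'(\mathbf{x}) \end{pmatrix} $$
has rank $\le 1$ for every $\mathbf{x} \in C_1$. 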


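Example. Let 
$$ C_1 : \mathbb{P}^1 \quad \text{and} \quad C_2 : x^2 + y^2 = 1. $$
The rational map 
$$ t \mapsto \left( \frac{2t}{t^2 + 1},\ \frac{t^2 - 1}{t^2 + 1} \right) $$
can be viewed as the map 
$$ (t, w) \mapsto (2tw,\ t^2 - w^2,\ t^2 + w^2). $$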
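
A quick check in exact arithmetic shows that both descriptions of this map land on the circle and agree with each other. The sketch assumes the affine map t ↦ (2t/(t²+1), (t²−1)/(t²+1)) and its homogeneous version (t, w) ↦ (2tw, t² − w², t² + w²); the function names are labels for this sketch.

```python
from fractions import Fraction

def affine_map(t):
    """Affine form of the map from the line to the circle x^2 + y^2 = 1."""
    return (2 * t / (t * t + 1), (t * t - 1) / (t * t + 1))

def projective_map(t, w):
    """Homogeneous form of the same map, landing on X^2 + Y^2 = Z^2."""
    return (2 * t * w, t * t - w * w, t * t + w * w)

t = Fraction(3, 2)
x, y = affine_map(t)
X, Y, Z = projective_map(3, 2)                       # (t, w) = (3, 2) represents t = 3/2
print((x, y), (X, Y, Z), x * x + y * y, X * X + Y * Y - Z * Z)
```

The homogeneous version is also defined at (t, w) = (1, 0), the point at infinity of the line, where it gives (0, 1, 1).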


Example. Let 
$$ C_1 : x^4 + y^4 = 1 \quad \text{and} \quad C_2 : u^2 + v^2 = 1. $$
A map which takes $C_1$ into $C_2$ is 
$$ (x, y) \mapsto (x^2, y^2) = (u, v). $$

A non-constant map has a degree $\delta$ which is defined as the largest integer such that 
$\delta$ points on $C_1$ are mapped into a single point on $C_2$. In the first example, one could 
find an inverse map, so $\delta = 1$. In the second example, $\delta = 4$. 

Fact. If $C_1, C_2$ are non-singular curves in $\mathbb{P}^2$, then a rational map $C_1 \to C_2$ is 
necessarily defined everywhere. If $C_1, C_2$ are irreducible and the map $\varphi$ is not constant, 
then $\varphi$ is onto $C_2$. (See, e.g. Silverman (1985), Ch. II, Prop. 2.1 and Theorem 2.3). 

An isogeny of elliptic curves $E_1, E_2$ with respective base points $O_1, O_2$ is defined 
as a non-constant rational map $\varphi : E_1 \to E_2$ with $\varphi(O_1) = O_2$. By what we just said 
above, such a map is onto. 

THEOREM 10A. An isogeny is a homomorphism of the groups belonging to $E_1$ 
and $E_2$. 
How would one begin to prove this? We have 
$$ (P + Q) + (O_1) \sim (P) + (Q), $$
where ~ is an equivalence relation on the divisors of $E_1$. We would need 
$$ (\varphi(P + Q)) + (\varphi(O_1)) \sim (\varphi(P)) + (\varphi(Q)), $$
where ~ is now the equivalence relation on divisors of $E_2$. So we have a rational function 
f on $E_1$ such that 
$$ \mathrm{Div}\, f = (P) + (Q) - (O_1) - (P + Q), $$
and we need a rational function $f_*$ on $E_2$ with 
$$ \mathrm{Div}\, f_* = (\varphi(P)) + (\varphi(Q)) - (\varphi(O_1)) - (\varphi(P + Q)). $$
If 
$$ D = \sum_{i=1}^{\ell} c_i (P_i) $$
is a divisor of $E_1$, then let 
$$ \varphi(D) = \sum_{i=1}^{\ell} c_i (\varphi(P_i)) $$
on $E_2$. So it would suffice if with every rational function f of $E_1$, we could associate a 
rational function $\varphi(f) = f_*$ on $E_2$ such that 
$$ \mathrm{Div}(\varphi(f)) = \varphi(\mathrm{Div}\, f). $$
It is proved in Algebraic Geometry that this can be done. (See Silverman (1985), Ch. 
II, Prop. 3.6). 

Now suppose the functions defining $\varphi : E_1 \to E_2$ have rational coefficients. Then 
$$ \varphi : E_1(\mathbb{Q}) \to E_2(\mathbb{Q}). $$
Since $\varphi$ is a group homomorphism, it is precisely $\delta : 1$ where $\delta$ is the degree, and the 
kernel is of order $\delta$. It is a general fact that a homomorphism with finite kernel of free 
abelian groups preserves the rank. 
Consider the special case of a binary cubic form 
$$ F(x, y) = a x^3 + b x^2 y + c x y^2 + d y^3 $$
with distinct linear factors. Take the curve 
$$ C_m : F(x, y) = m. $$

Exercise 10a. Consider binary cubic forms F with complex coefficients and non-zero 
discriminant along with non-singular linear maps $T : \mathbb{C}^2 \to \mathbb{C}^2$. Let $F_0(x, y) = 
x^3 + y^3$. Then any such F is of the form $F(\mathbf{x}) = F_0(T(\mathbf{x}))$ for a suitable T. 

Given a binary cubic form 
$$ F(x, y) = a x^3 + b x^2 y + c x y^2 + d y^3, $$
associate with it 
$$ G(x, y) = \frac{1}{4} \begin{vmatrix} F_{xx} & F_{xy} \\ F_{yx} & F_{yy} \end{vmatrix} = (3ac - b^2) x^2 + (9ad - bc) x y + (3bd - c^2) y^2 $$
and 
$$ \begin{aligned} H(x, y) = \begin{vmatrix} F_x & F_y \\ G_x & G_y \end{vmatrix} = {} & (27a^2 d - 9abc + 2b^3) x^3 - 3(6ac^2 - b^2 c - 9abd) x^2 y \\ & + 3(6b^2 d - bc^2 - 9acd) x y^2 - (27ad^2 - 9bcd + 2c^3) y^3. \end{aligned} $$


The new forms G, H are called covariant forms. We also introduce the notation 
$$ F^T(\mathbf{x}) = F(T(\mathbf{x})). $$
If 
$$ F \to F^T, $$
then 
$$ G \to (\det T)^2\, G^T $$
and 
$$ H \to (\det T)^3\, H^T. $$


Remark. This is easy to check, using the two special transformations with matrix 
0 1 l a 
(ap i- ( 4): 


Consider the curve 
$$ J_m : y^2 z = x^3 - 432\, m^2 D z^3, $$
where D = discr F and F is as above. We have a map 
$$ \varphi : C_m \to J_m : (x, y, z) \mapsto (-4 z\, G(x, y),\ 4 H(x, y),\ z^3). $$
To see that such $\varphi$ really maps into $J_m$, look first at the special case where m = 1 and 
$F(x, y) = x^3 + y^3$. We have 
$$ C : x^3 + y^3 = z^3, $$
$$ J : y^2 z = x^3 - 432(-27) z^3, $$
since D = −27. Then $F = F_0 = x^3 + y^3$ and G = 9xy and $H = 27(x^3 - y^3)$. We need 
$$ 16 H^2 z^3 = -64 z^3 G^3 + 432 \cdot 27 z^9, $$
i.e. we need 
$$ H^2 = -4 G^3 + 27^2 z^6, $$
which does hold. The general truth follows, since we have $F = F_0^T$ in general and G, H 
have the necessary covariance properties. 
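
The claim that this map carries points of the Thue curve to points of the Weierstrass curve can be tested on integers. The sketch below assumes the coefficient formulas for the covariants G and H of F = ax³ + bx²y + cxy² + dy³, the map (x, y, z) ↦ (−4zG, 4H, z³) and the target curve y²z = x³ − 432m²Dz³; the function names and sample points are illustrative.

```python
def covariants(a, b, c, d):
    """Return F, G, H as functions of (x, y) for F = a x^3 + b x^2 y + c x y^2 + d y^3."""
    def F(x, y):
        return a*x**3 + b*x*x*y + c*x*y*y + d*y**3
    def G(x, y):
        return (3*a*c - b*b)*x*x + (9*a*d - b*c)*x*y + (3*b*d - c*c)*y*y
    def H(x, y):
        return ((27*a*a*d - 9*a*b*c + 2*b**3)*x**3
                - 3*(6*a*c*c - b*b*c - 9*a*b*d)*x*x*y
                + 3*(6*b*b*d - b*c*c - 9*a*c*d)*x*y*y
                - (27*a*d*d - 9*b*c*d + 2*c**3)*y**3)
    return F, G, H

def phi(point, G, H):
    """The map (x, y, z) -> (-4 z G(x, y), 4 H(x, y), z^3)."""
    x, y, z = point
    return (-4*z*G(x, y), 4*H(x, y), z**3)

def on_J(point, m, D):
    """Check Y^2 Z = X^3 - 432 m^2 D Z^3."""
    X, Y, Z = point
    return Y*Y*Z == X**3 - 432*m*m*D*Z**3

# F = x^3 + y^3 has discriminant -27; the points lie on F(x, y) = 9 z^3.
F, G, H = covariants(1, 0, 0, 1)
images = [phi(P, G, H) for P in [(2, 1, 1), (20, -17, 7)]]
print(images, [on_J(Q, 9, -27) for Q in images])
```

A second test with F = x³ − xy², which has discriminant 4 and takes the value 6 at (2, 1), exercises the formulas on a form other than x³ + y³.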

Under this map, a point on $C_m$ with F(x, y) = 0, z = 0 will be mapped onto 
(0, 4H(x, y), 0) = (0, 1, 0) since $H(x, y) \neq 0$. Also, if $C_m$ has any rational point, say $O_1$, 
then $\varphi(O_1) = O_2$ will be rational on $J_m$. Now $\varphi$ defines an isogeny from $C_m$ with base 
point $O_1$ to $J_m$ with base point $O_2$. 


§11. Upper Bounds on Cubic Thue Equations in Terms of the Mordell- 
Weil Rank. 


As in section 8, we consider curves of the type
$C_m : F(x,y) = m,$


where F is a form of degree 3 with no multiple factors. If $C_m$ contains no rational points then we have a trivial upper bound. So suppose that $C_m$ contains a rational point. Then we have an elliptic curve with some rank $C_m(\mathbb{Q}) = R$, say.


THEOREM 11A. When $m > m_0(F)$ and m is cube-free and $C_m$ contains a rational point, then the number of integer solutions of the cubic Thue equation F(x,y) = m is

$< c^{1 + \operatorname{rank} C_m(\mathbb{Q})},$

where c is an absolute constant.


This result is due to Silverman (1982), but there was an earlier attempted proof by 
Demjanenko (1974). Recall that Bombieri and Schmidt have given the bound 


$c_0\, 3^{1+\nu},$


where v = v(m) is the number of distinct prime factors of m, and only the primitive 
solutions are counted. These two estimates are rather different and, at this point, no 
one has shown how they fit together. (See also C. Stewart (to appear)). 

Suppose there exists some form F such that, as m ranges over positive cube-free integers, the number of solutions of F(x,y) = m is unbounded as a function of m. If this is so, then rank $C_m(\mathbb{Q})$ is unbounded, and this would prove the conjectured existence of elliptic curves of arbitrarily high rank.


LEMMA 11B. Consider the Thue equation 


F(z,y)=m 
of degree 3 and define
H*(F) = (cont F)hx(ai) 


as in Chapter III, Section 2. Then the number of solutions to this Thue equation with 
max (|z|, yl) > 4 H*(F) m"? 


is 
& d. 


Proof. By Lemma 3C of Chapter III, we have 


alo, hg(a)t m 
2 ly|? 


for some root a of F(2z,1). If |y| 2 |z|, we get 


44H*(F)4m 1 
y4/8 y3v4/2 y3va/2 ' 


| z 
aie 
y 


By a result of Chapter III, the number of solutions of this last relation with y 2 H*(F) 
is < d. 

Now we are able to prove Theorem 11A. Suppose we have a solution $(x, y) \in \mathbb{Z}^2$ of F(x,y) = m. Then (x,y,1) is a point on the curve $C_m : F(x,y) = mz^3$. As in the preceding section, we have a map $\phi : C_m \to J_m$, where $J_m : y^2 z = x^3 - 432 m^2 D z^3$. Let $h_0$ be the Mordell-Weil height on $J_m$. Then since $\phi$ was defined in terms of cubic forms,


$h_0(\phi(x,y,1)) \le 3\log|\mathbf{x}| + c_1(F).$
By the lemma above, we have 
jz] SERF mA 
with $< c_2$ exceptions. If $\mathbf{x}$ is not an exception, then
$h_0(\phi(x,y,1)) \le 8\log m + c_3(F) \le 9\log m$


for $m \ge c_4(F)$. We apply Theorem 9A with $-432D$ in place of D. Since m is sixth power free, the number of rational points P on $J_m$ with $h_0(P) \le 9\log m$ is


$< c_5^{1+\operatorname{rank} J_m} = c_5^{1+\operatorname{rank} C_m},$


since Jm is obtained from Cm by an isogeny. Since ¢ is at most six to one, we get an 
estimate for the number of integer points on Cm. 
§12. More general results. Our discussion would be incomplete without at least a mention of the following deep results. Their proofs, however, are beyond the scope of these Notes.

Siegel (1929) proved that the number of integer points (x, y) on any irreducible curve


f(z,y) =0 (12.1) 


of genus g > 0 is finite. Baker and Coates (1970), in the case g = 1, gave effective bounds 
for the number and the size of such points. They accomplished this by constructing a 
suitable birational transformation to an elliptic curve y? = f(z). Better bounds were 
recently achieved by Schmidt (to appear): If (12.1) defines an irreducible curve of genus 
1, where f has rational coefficients, then the number of integer solutions z, y € Z is 


< ald) BeA, 


where H is the height of f, where d is its degree, and $c_2(d)$ is a polynomial in d. (E.g., a polynomial of degree 13, but this can surely be improved.) Furthermore, possible integer solutions have


max (izl, ii) < exp (aae), 


Faltings (1983) proved Mordell’s conjecture, that on a curve of genus g > 1, there 
are only finitely many rational points. Another proof, with ideas closer to diophantine 
approximations, was given by Vojta (to appear), with a more elementary version given 
by Bombieri (to appear). There is every hope that this will lead to bounds on the 
number of rational points. However, when g > 1, effective results on the size of integer 
points or rational points (the size of numerators and denominators) seem at present to 
be quite beyond reach. 


V. Diophantine Equations in More than Two Variables. 
References: Evertse, Gyory, Stewart and Tijdeman (1988), Schmidt (1980) 


§1. The Subspace Theorem. 


THEOREM 1A. (Subspace Theorem, Schmidt (1972)). Suppose that $L_1, \ldots, L_n$ are linearly independent linear forms in n variables with algebraic coefficients. Suppose $\delta > 0$ is given. Then the integer points $\mathbf{x} \ne \mathbf{0}$ with


$|L_1(\mathbf{x}) \cdots L_n(\mathbf{x})| < |\mathbf{x}|^{-\delta}$


lie in a finite number of proper subspaces of $\mathbb{Q}^n$.
The reader may find a proof in Schmidt (1980).


COROLLARY 1B. Suppose $\alpha_1, \ldots, \alpha_n$ are algebraic and $1, \alpha_1, \ldots, \alpha_n$ are linearly independent over $\mathbb{Q}$. Then there are only finitely many rational n-tuples $(x_1/y, \ldots, x_n/y)$ with y > 0 and


(9.1)  $\left|\alpha_i - \frac{x_i}{y}\right| < \frac{1}{y^{1+(1/n)+\delta}}$  $(i = 1, \ldots, n),$

where $\delta > 0$.


In the special case n = 1, we get Roth's Theorem. Also, the exponent 1 + (1/n) is best possible by Dirichlet's Theorem (Theorem 1B of Chapter II).


Proof. Multiplying together all of the inequalities in (9.1), then multiplying by $y^{n+1}$ gives

$y\,|\alpha_1 y - x_1| \cdots |\alpha_n y - x_n| < 1/y^{n\delta}.$


Now put $\mathbf{x} = (x_1, \ldots, x_n, y) \in \mathbb{Z}^{n+1}$ and let $\mathbf{X} = (X_1, \ldots, X_n, Y)$. Let
$L_i(\mathbf{X}) = \alpha_i Y - X_i$  $(i = 1, \ldots, n)$


and
$L_{n+1}(\mathbf{X}) = Y.$


Then we have
$|L_1(\mathbf{x}) \cdots L_{n+1}(\mathbf{x})| < 1/y^{n\delta} < 1/|\mathbf{x}|^{\delta/2}$


if y is large.
By the Subspace Theorem in n + 1 dimensions, the solutions lie in a finite number of subspaces. Let one such subspace be given by
$c_1 x_1 + \cdots + c_n x_n + c_{n+1} y = 0$


with $c_i \in \mathbb{Q}$. On this particular subspace we have


$(c_1\alpha_1 + \cdots + c_n\alpha_n + c_{n+1})\,y = c_1(\alpha_1 y - x_1) + \cdots + c_n(\alpha_n y - x_n)$
by the defining equation above. Let $\gamma = c_1\alpha_1 + \cdots + c_n\alpha_n + c_{n+1}$. Then $\gamma \ne 0$ by the linear independence of $1, \alpha_1, \ldots, \alpha_n$, and also $\gamma$ is fixed for a given subspace. We have


$|\gamma|\,|y| \le (|c_1| + \cdots + |c_n|)/y^{(1/n)+\delta} \le |c_1| + \cdots + |c_n|.$


So y is bounded and we are finished.

One would like to make the Subspace Theorem more quantitative. Recall that Roth's Theorem is ineffective in the sense that it does not give estimates for x, y. It can be strengthened, as we have seen, to give bounds on the number of solutions. A similar result is true in this case as well. We cannot estimate the coefficients of the defining equations of the subspaces (i.e., we cannot estimate their heights), but we can give a bound for the number of subspaces.


THEOREM 1C. (Schmidt (1989a)). Let $L_1, \ldots, L_n$ be linearly independent linear forms with coefficients in an algebraic number field of degree d. Consider the inequality

$|L_1(\mathbf{x}) \cdots L_n(\mathbf{x})| < |\det(L_1, \ldots, L_n)|\, |\mathbf{x}|^{-\delta},$


where $0 < \delta < 1$. Then there are proper subspaces $S_1, \ldots, S_t$ of $\mathbb{Q}^n$ where
$t = [(2d)^{2^{26n}\delta^{-2}}]$
such that all integer solutions $\mathbf{x} \ne \mathbf{0}$ lie in the union of $S_1, \ldots, S_t$ and the ball
$|\mathbf{x}| \le \max((n!)^{8/\delta}, H(L_1), \ldots, H(L_n)).$


Schlickewei (1977) generalized Schmidt’s Subspace Theorem to allow more general 
absolute values. 


THEOREM 1D. Let K be an algebraic number field and let $S \subset M(K)$ be a finite set of absolute values which contains all of the Archimedean ones. For $v \in S$, let $L_{v1}, \ldots, L_{vn}$ be n linearly independent linear forms in n variables with coefficients in K. Let $\delta > 0$ be given. Then the solutions of the inequality


$\prod_{v \in S} \prod_{i=1}^{n} |L_{vi}(\mathbf{x})|_v^{n_v} < \overline{|\mathbf{x}|}^{-\delta}$


with $\mathbf{x} \in O_K^n$ and $\mathbf{x} \ne \mathbf{0}$, where

$\overline{|\mathbf{x}|} = \max_{1 \le i \le n,\ 1 \le j \le d} |x_i^{(j)}|,$

lie in finitely many proper subspaces of $K^n$. (As always, $O_K$ is the ring of integers in K, and $n_v$ is the local degree).
A quantitative result was proved by Schlickewei (to appear (a); see also (b)).




THEOREM 1E. Let $S \subset M(\mathbb{Q})$ be of finite cardinality s, and containing the Archimedean absolute value. Let K be a number field of degree d, and suppose that for each $v \in S$ we are given a fixed extension of $|\cdot|_v$ to K. For $v \in S$, let $L_{v1}, \ldots, L_{vn}$ be n linearly independent linear forms in n variables and with coefficients in K. Consider the inequality


$\prod_{v \in S} \prod_{i=1}^{n} |L_{vi}(\mathbf{x})|_v < \Bigl(\prod_{v \in S} |\det(L_{v1}, \ldots, L_{vn})|_v\Bigr)\, |\mathbf{x}|^{-\delta}.$


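Then there are proper subspaces $S_1, \ldots, S_t$ of $\mathbb{Q}^n$, where
$t = [(8sd!)^{2^{34nd!} s^6 \delta^{-2}}],$


such that all the solutions $\mathbf{x} \in \mathbb{Z}^n$ lie in the union of $S_1, \ldots, S_t$ and the box


$|\mathbf{x}| \le \max\bigl((n!)^{8/\delta},\ H(L_{vi})\ (v \in S,\ 1 \le i \le n)\bigr).$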


In Theorems 1C and 1E, the reader should note that t, the bound on the number 
of subspaces, is independent of the linear forms. 
The following theorem will turn out to be equivalent to Theorem 1D. 


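THEOREM 1D'. Let K, S be as above. For $v \in S$, let $L_{vi}$ $(i = 1, \ldots, n)$ be n linearly independent linear forms over K. Then solutions $\mathbf{x} \in \mathbb{P}^{n-1}(K)$ to the inequality


(9.2)  $\prod_{v \in S} \prod_{i=1}^{n} \left(\frac{|L_{vi}(\mathbf{x})|_v}{|\mathbf{x}|_v}\right)^{n_v} < \frac{1}{H_K(\mathbf{x})^{n+\delta}},$


where $\delta > 0$, lie in finitely many proper subspaces.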


Remark. It is reasonable to consider $\mathbf{x} \in \mathbb{P}^{n-1}(K)$, since both sides of the inequality are invariant under multiplying $\mathbf{x}$ by a scalar.


LEMMA 1F. Any element $\mathbf{x}$ of $\mathbb{P}^{n-1}(K)$ has a set of coordinates $\mathbf{x} = (x_1, \ldots, x_n)$ such that


(i) $|\mathbf{x}|_v \le 1$ for v non-Archimedean,


(ii) $\prod_{v\ \text{non-Archimedean}} |\mathbf{x}|_v^{n_v} \ge 1/c_1(K)$,


(iii) $|\mathbf{x}|_v \le c_2(K)\,|\mathbf{x}|_w$ for any Archimedean v, w.

Remark. While (i) bounds $|\mathbf{x}|_v$ from above for v non-Archimedean, (ii) says that $|\mathbf{x}|_v$ cannot be too small. Result (iii) says that all of the Archimedean absolute values are about the same.


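Proof. Consider the ideal $\mathfrak{I}(\mathbf{x})$ generated by $x_1, \ldots, x_n$. There is some integral ideal $\mathfrak{A}$ in the same ideal class as $\mathfrak{I}(\mathbf{x})$, and with $N(\mathfrak{A}) \le c_1(K)$. After multiplying $\mathbf{x}$ by some $\lambda \in K$, we get a new $\mathbf{x}'$ with $\mathfrak{I}(\mathbf{x}') = \mathfrak{A}$. After a change of notation, write $\mathfrak{I}(\mathbf{x}) = \mathfrak{A}$. Then $x_1, \ldots, x_n$ are integers in K, and so (i) holds.

If $(x) = \mathfrak{P}^{a}\,\mathfrak{P}_2^{a_2} \cdots \mathfrak{P}_r^{a_r}$ for an integer x of K and if $(p) = \mathfrak{P}^{e}\,\mathfrak{Q}_2^{e_2} \cdots \mathfrak{Q}_s^{e_s}$, then the absolute value associated with $\mathfrak{P}$ has $|x|_{\mathfrak{P}} = p^{-a/e}$. Also if $N(\mathfrak{P}) = p^f$, then $n_{\mathfrak{P}} = ef$. (We are using rudimentary facts from algebraic number theory). Putting this together gives $|x|_{\mathfrak{P}}^{n_{\mathfrak{P}}} = N(\mathfrak{P})^{-a}$. So if

$(x_i) = \prod_j \mathfrak{P}_j^{a_{ij}},$

then

$|\mathbf{x}|_{\mathfrak{P}_j}^{n_{\mathfrak{P}_j}} = N(\mathfrak{P}_j)^{-\min(a_{1j}, \ldots, a_{nj})}.$

On the other hand,

$\mathfrak{I}(\mathbf{x}) = \prod_j \mathfrak{P}_j^{\min(a_{1j}, \ldots, a_{nj})},$

so we have

$\prod_{v\ \text{non-Archimedean}} |\mathbf{x}|_v^{n_v} = \prod_j |\mathbf{x}|_{\mathfrak{P}_j}^{n_{\mathfrak{P}_j}} = N(\mathfrak{A})^{-1},$

and (ii) holds.
By Dirichlet's Unit Theorem, if $\alpha \mapsto \alpha^{(1)}, \ldots, \alpha \mapsto \alpha^{(r_1)}$ correspond to real embeddings of K into $\mathbb{R}$ and $\alpha \mapsto \alpha^{(r_1+1)}, \ldots, \alpha \mapsto \alpha^{(r_1+r_2)}$ together with their complex conjugates correspond to the complex embeddings, then given any positive $C_1, \ldots, C_{r_1+r_2}$, there exists a unit $\varepsilon$ such that

$|\varepsilon^{(i)}|\,C_i \le c_2(K)\,|\varepsilon^{(j)}|\,C_j$

for $1 \le i, j \le r_1 + r_2$ and some constant $c_2(K)$. To prove (iii), we need

$|\mathbf{x}|_v \le c_2(K)\,|\mathbf{x}|_w,$

and this can be obtained by multiplying $\mathbf{x}$ by a suitable unit $\varepsilon$.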
We can now see that Theorem 1D implies Theorem 1D’. By the lemma, we have 


1 
< me <1 
aK) É I lle" $1, 


v non-Archimedean 


more generally 
Ry < 1. 
c TA = <I Izlv 


In view of this inequality, (9.2) implies that 


I] [I mor s ee. 


vES i=1 
By the lemma again, we have


—d 
|z| <i? 
where =, i 
= VI > k 
izl 1eien |z; | v M en lzlv 
1SjSritre 
Then 


Tl Lf or < ae 


vES i=l 


and Theorem 1D implies that the solutions lie in finitely many proper subspaces. 


Exercise 1a. Show that Theorem 1D' implies Theorem 1D.


§2. General S-unit Equations. 


We now return to, and elaborate on, results stated at the beginning of Ch. IV, §3.
Let K be a number field and $S \subset M(K)$ a set of absolute values which contains all of the Archimedean ones. First, we consider equations of the form


$x_0 + x_1 + \cdots + x_n = 0,$


where $x_i \in U_S$ $(i = 0, \ldots, n)$. Evertse (1984), as well as Schlickewei and Van der Poorten (1982), have the following result.


THEOREM 2A. An S-unit equation of the form
$x_0 + x_1 + \cdots + x_n = 0$


has only finitely many solutions $\mathbf{x} \in \mathbb{P}^n(U_S)$ for which no non-trivial subsum vanishes, i.e. for which

$\sum_{i \in \mathfrak{I}} x_i \ne 0$

when $\emptyset \ne \mathfrak{I} \ne \{0, 1, \ldots, n\}$.


Example. The equation $\pm 3^{a} \pm 3^{b} \pm 5^{c} \pm 5^{d} = 0$ is an S-unit equation if $S = \{\infty, 3, 5\}$. However, we get infinitely many solutions whose subsums vanish, such as $3^a - 3^a + 5^c - 5^c = 0$. So we see that the condition about no non-trivial subsum vanishing is a necessary hypothesis for this result.
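
For n = 2 the equation $x_0 + x_1 + x_2 = 0$ can be normalized to u + v = 1 with S-units u, v, and here no proper subsum can vanish. The sketch below, which is not part of the notes, runs a bounded search for such pairs over $K = \mathbb{Q}$ with the illustrative choice S = {∞, 2, 3}. The exponent bound `max_exp` and the function names are assumptions made for the example, and a bounded search cannot by itself prove that the list of solutions is complete.

```python
from fractions import Fraction


def is_s_unit(q, primes=(2, 3)):
    """True if the nonzero rational q equals +-(a product of powers of the given primes)."""
    if q == 0:
        return False
    for part in (abs(q.numerator), q.denominator):
        for p in primes:
            while part % p == 0:
                part //= p
        if part != 1:
            return False
    return True


def unit_equation_solutions(max_exp=6):
    """Search u + v = 1 over S-units u = +-2^a 3^b with |a|, |b| <= max_exp."""
    found = set()
    exponents = range(-max_exp, max_exp + 1)
    for a in exponents:
        for b in exponents:
            for sign in (1, -1):
                u = sign * Fraction(2) ** a * Fraction(3) ** b
                v = 1 - u
                if is_s_unit(v):
                    found.add((u, v))
    return sorted(found)
```

Raising `max_exp` does not keep producing new pairs, which is the behaviour Theorem 2A predicts.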

Schlickewei had a slightly weaker version of this result before he obtained the general 
version above. Instead of requiring that no subsum vanish, he had imposed a condition 
about distinct primes in the summands. 
COROLLARY 2B. Given coefficients $a_0, \ldots, a_n$ in $K^{\times}$, the equation
$a_0 x_0 + \cdots + a_n x_n = 0$
has at most finitely many solutions $\mathbf{x} \in \mathbb{P}^n(U_S)$ for which no non-trivial subsum vanishes.


Proof. If S is suitably enlarged, we have $a_i \in U_S$. Then set $x_i' = a_i x_i$. By the theorem, there are only finitely many possibilities for $\mathbf{x}'$ for which no subsum vanishes. (Notice that when S is enlarged, the theorem is strengthened since more S-units are allowed.)


Proof (of Theorem 2A). For $v \notin S$, we have $|x_i|_v = 1$, since $x_i$ is an S-unit. Then by the product formula, we have


$\prod_{v \in S} |x_i|_v^{n_v} = 1$  $(i = 0, \ldots, n),$


and 
Now let $\tilde{\mathbf{x}} = (x_0, \ldots, x_{n-1})$. Then


vES i=0 = 


For each $v \in S$, choose $i(v)$ in $0 \le i(v) \le n - 1$ such that
$|\tilde{\mathbf{x}}|_v = |x_{i(v)}|_v,$


and restrict to solutions where the $i(v)$ for $v \in S$ are fixed. There are $n^{|S|}$ such choices.
Let the set of linear forms $L_{vj}$ $(1 \le j \le n)$ be the set $\{X_0, X_1, \ldots, X_{n-1}, (X_0 + \cdots + X_{n-1})\} \setminus \{X_{i(v)}\}$. Then


TT (A) - ae 


and by the Subspace Theorem (version 1D'), the solutions $\tilde{\mathbf{x}}$ lie in finitely many subspaces.
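
The product inequality behind this step can be reconstructed as follows. This is only a sketch, under the assumptions that $|\tilde{\mathbf{x}}|_v = \max_i |x_i|_v$, that $H_K(\tilde{\mathbf{x}}) = \prod_v |\tilde{\mathbf{x}}|_v^{n_v}$, and that the forms $L_{vj}$ are the ones just listed; the notes may normalize the display differently.

```latex
\begin{align*}
\prod_{v\in S}\prod_{j=1}^{n} |L_{vj}(\tilde{\mathbf{x}})|_v^{n_v}
  &= \prod_{v\in S} \frac{\prod_{i=0}^{n} |x_i|_v^{n_v}}{|x_{i(v)}|_v^{n_v}}
  && \text{(the forms take the values } x_0,\dots,x_{n-1},-x_n \text{, with } x_{i(v)} \text{ omitted)} \\
  &= \prod_{v\in S} |\tilde{\mathbf{x}}|_v^{-n_v}
  && \text{(product formula for each } x_i \text{, and } |x_{i(v)}|_v = |\tilde{\mathbf{x}}|_v) \\
  &= H_K(\tilde{\mathbf{x}})^{-1}
  && (|\tilde{\mathbf{x}}|_v = 1 \text{ for } v \notin S), \\[4pt]
\prod_{v\in S}\prod_{j=1}^{n}
  \left(\frac{|L_{vj}(\tilde{\mathbf{x}})|_v}{|\tilde{\mathbf{x}}|_v}\right)^{n_v}
  &= H_K(\tilde{\mathbf{x}})^{-1}\, H_K(\tilde{\mathbf{x}})^{-n}
   = H_K(\tilde{\mathbf{x}})^{-n-1}.
\end{align*}
```

For any fixed $0 < \delta < 1$ the right-hand side is smaller than $H_K(\tilde{\mathbf{x}})^{-n-\delta}$ as soon as $H_K(\tilde{\mathbf{x}}) > 1$, which is the shape of hypothesis required by Theorem 1D'.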
Say one such subspace is given by


$c_0 x_0 + \cdots + c_{n-1} x_{n-1} = 0.$


Let $\mathfrak{I}_0$ be the set of i with $c_i \ne 0$. Then


(2.1)  $\sum_{i \in \mathfrak{I}_0} c_i x_i = 0$
is an S-unit equation. Let $\mathfrak{I} \subseteq \mathfrak{I}_0$ with $\mathfrak{I} \ne \emptyset$ and consider solutions of


(2.2)  $\sum_{i \in \mathfrak{I}} c_i x_i = 0$

for which no subsum vanishes. For every solution to (2.1), there is such a set $\mathfrak{I}$, and the number of such sets $\mathfrak{I}$ is finite. Apply the case $n' + 1 = |\mathfrak{I}|$ to the S-unit equation (2.2). (We are using induction on n). Up to proportionality, there are finitely many solutions. So it suffices to study solutions where $\{x_i\}_{i \in \mathfrak{I}}$ is proportional to a fixed $\{u_i\}_{i \in \mathfrak{I}}$, i.e. where


$x_i = \xi u_i$  $(i \in \mathfrak{I}).$


Return to the given equation

$x_0 + x_1 + \cdots + x_n = 0,$

which we can rewrite as

$\xi \Bigl(\sum_{i \in \mathfrak{I}} u_i\Bigr) + \sum_{i \notin \mathfrak{I}} x_i = 0.$

If $\sum_{i \in \mathfrak{I}} u_i \ne 0$, this is an S-unit equation in $n + 1 - |\mathfrak{I}| + 1 = n + 2 - |\mathfrak{I}| \le n$ variables, namely $\xi$ and $x_i$ $(i \notin \mathfrak{I})$. By induction, we get finitely many solutions for which no subsum vanishes. On the other hand, if $\sum_{i \in \mathfrak{I}} u_i = 0$, then the subsum $\sum_{i \in \mathfrak{I}} x_i$ vanishes as well and we are not interested in such solutions.

We have proved that there are only finitely many solutions to an S-unit equation for which no subsum vanishes. In the case where $K = \mathbb{Q}$, a bound has been given recently. Given coefficients $a_0, \ldots, a_n$ in $\mathbb{Q}^{\times}$, the number of solutions $\mathbf{x} \in \mathbb{P}^n(U_S)$ of


$a_0 x_0 + \cdots + a_n x_n = 0$


for which no subsum vanishes is

$\le (8s)^{2^{26n+4} s^6},$

where s is the cardinality of S. This result is due to Schlickewei (to appear). Notice that the bound is independent of the coefficients $a_0, \ldots, a_n$. The case for general K is done in Schlickewei (to appear (d)).


§3. Norm Form Equations. 

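Let K be a number field with $[K : \mathbb{Q}] = d$ and L(X) be a linear form, say $L(\mathbf{X}) = L(X_1, \ldots, X_n) = \alpha_1 X_1 + \cdots + \alpha_n X_n$ with $\alpha_i \in K$. Suppose that $\alpha_1, \ldots, \alpha_n$ are linearly independent over $\mathbb{Q}$. Then $n \le d$ and L will not vanish on $\mathbb{Q}^n \setminus \mathbf{0}$. As usual, denote the embeddings of K into $\mathbb{C}$ by $\alpha \mapsto \alpha^{(j)}$ $(j = 1, \ldots, d)$ and write $L^{(j)}(\mathbf{X}) = \alpha_1^{(j)} X_1 + \cdots + \alpha_n^{(j)} X_n$. We write


$\mathfrak{N}(L(\mathbf{X})) = \prod_{j=1}^{d} L^{(j)}(\mathbf{X}).$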
A norm form is any form F of the type $F(\mathbf{X}) = a\,\mathfrak{N}(L(\mathbf{X}))$ for some L as above and $a \in \mathbb{Q}^{\times}$. For a norm form F, we have $F(\mathbf{X}) \in \mathbb{Q}[\mathbf{X}]$, and F is trivially decomposable over the algebraic numbers as a product of linear factors.
By a norm form equation we mean an equation of the type
$F(\mathbf{x}) = m, \quad \mathbf{x} \in \mathbb{Z}^n,$


where F is a norm form. When d > n this is a generalization of the Thue equation. For if n = 2, the linear form $L(X, Y) = X - \alpha Y$ gives norm forms


$F(X, Y) = a \prod_{i=1}^{d} (X - \alpha^{(i)} Y).$


If $\deg \alpha = d \ge 3$, then F(x, y) = m is a Thue equation.

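As $\mathbf{x}$ runs through $\mathbb{Z}^n$, the linear expression $L(\mathbf{x})$ will run through a set $\mathfrak{M} \subset K$. This set $\mathfrak{M}$ is a free $\mathbb{Z}$-module of rank n with basis $\alpha_1, \ldots, \alpha_n$. So we could rewrite the norm form equation as


$a\,\mathfrak{N}(\mu) = m,$


where $\mu \in \mathfrak{M}$.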

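Let $\mathbb{Q}\mathfrak{M}$ be the set of products $q\mu$ with $q \in \mathbb{Q}$, $\mu \in \mathfrak{M}$. Then $\mathbb{Q}\mathfrak{M}$ consists of $\alpha_1 x_1 + \cdots + \alpha_n x_n$ with $x_i \in \mathbb{Q}$ $(i = 1, \ldots, n)$. Let E be a subfield of K and let $\mathfrak{M}^E$ be the set of $\mu \in \mathfrak{M}$ such that

$\lambda\mu \in \mathbb{Q}\mathfrak{M}$


for every $\lambda \in E$. Then $\mathfrak{M}^E$ is a submodule of $\mathfrak{M}$. If $E \subseteq E'$, then $\mathfrak{M}^{E'} \subseteq \mathfrak{M}^E$; and we have $\mathfrak{M}^{\mathbb{Q}} = \mathfrak{M}$. The module $\mathfrak{M}$ is called degenerate if there is a field $E \subseteq K$ with $E \ne \mathbb{Q}$ and E not imaginary quadratic such that $\mathfrak{M}^E \ne \{0\}$. We say that F is degenerate if the corresponding $\mathfrak{M}$ is degenerate.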

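Example. Take $K = \mathbb{Q}(i, \sqrt{2})$, which has d = 4, and take $L(X, Y, Z) = X + iY + i\sqrt{2}\,Z$. Let $E = \mathbb{Q}(i)$. Then $\{x + iy : x, y \in \mathbb{Z}\} = \mathfrak{M}^E$. For if $x + iy + i\sqrt{2}z \in \mathfrak{M}^E$, then we would need $ix - y - \sqrt{2}z \in \mathbb{Q}\mathfrak{M}$, which forces z = 0. This does not show that $\mathfrak{M}$ is degenerate, though, since E is imaginary quadratic. We could also take $E = \mathbb{Q}(i\sqrt{2})$ to see $\mathfrak{M}^E = \{x + i\sqrt{2}z\} \ne \{0\}$, or take $E = \mathbb{Q}(\sqrt{2})$ to get $\mathfrak{M}^E = \{iy + i\sqrt{2}z\} \ne \{0\}$. So we see that $\mathfrak{M}$ is, in fact, degenerate.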
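
The three submodules in this example can be checked mechanically. The sketch below is an illustration rather than anything from the notes: it represents elements of $\mathbb{Q}(i, \sqrt{2})$ by coordinates on the basis $(1, i, \sqrt{2}, i\sqrt{2})$, so that $\mathbb{Q}\mathfrak{M}$ is exactly the set of vectors whose $\sqrt{2}$-coordinate vanishes. Since $\mathbb{Q}\mathfrak{M}$ is a $\mathbb{Q}$-vector space, it is enough to test $\lambda\mu \in \mathbb{Q}\mathfrak{M}$ for $\lambda$ in a $\mathbb{Q}$-basis of E. All names are hypothetical.

```python
def mul(u, v):
    """Multiply two elements of Q(i, sqrt2) given on the basis (1, i, sqrt2, i*sqrt2)."""
    u0, u1, u2, u3 = u
    v0, v1, v2, v3 = v
    return (u0*v0 - u1*v1 + 2*u2*v2 - 2*u3*v3,        # coefficient of 1
            u0*v1 + u1*v0 + 2*(u2*v3 + u3*v2),        # coefficient of i
            u0*v2 + u2*v0 - (u1*v3 + u3*v1),          # coefficient of sqrt2
            u0*v3 + u3*v0 + u1*v2 + u2*v1)            # coefficient of i*sqrt2


ONE, I, SQRT2, I_SQRT2 = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def element(x, y, z):
    """The module element x + i y + i sqrt2 z as a coordinate vector."""
    return (x, y, 0, z)


def in_QM(w):
    """QM is spanned over Q by 1, i, i*sqrt2, so the sqrt2-coordinate must vanish."""
    return w[2] == 0


def in_M_E(mu, e_basis):
    """Test mu in M^E by multiplying with each element of a Q-basis of the subfield E."""
    return all(in_QM(mul(lam, mu)) for lam in e_basis)
```

For instance, `in_M_E(element(0, 3, -2), (ONE, SQRT2))` tests membership for $E = \mathbb{Q}(\sqrt{2})$, the subfield that makes $\mathfrak{M}$ degenerate.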


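Example. Suppose d = p where p is a prime > 2, and $\mathfrak{M}$ is a $\mathbb{Z}$-module of rank n, where n < p. The only subfields of K are K and $\mathbb{Q}$, so we just need to consider $\mathfrak{M}^K$. Suppose $\mu \in K$, $\mu \ne 0$. Notice that $\mathbb{Q}\mathfrak{M}$ is a vector space over $\mathbb{Q}$ of dimension n. As $\lambda$ runs through K, then $\lambda\mu$ runs through K, a vector space of dimension d. So $K\mu = K \not\subseteq \mathbb{Q}\mathfrak{M}$ and thus $\mu \notin \mathfrak{M}^K$. Then $\mathfrak{M}^K = \{0\}$ and $\mathfrak{M}$ is not degenerate.


Example. If n = d, then $K = \mathbb{Q}\mathfrak{M}$ and $\mathfrak{M}^K = \mathfrak{M}$. If $K \ne \mathbb{Q}$ and K is not imaginary quadratic, then $\mathfrak{M}$ is degenerate.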

Degeneracy is important, for if F is non-degenerate, then the norm form equation $F(\mathbf{x}) = m$ has only finitely many solutions (Schmidt (1972) and (1980) Lecture Notes). On the other hand, if F is degenerate, there will exist some m so that $F(\mathbf{x}) = m$ has infinitely many solutions. Before justifying this last remark, we will consider the simplest case which has infinitely many solutions.

Example. Suppose $\alpha_1, \ldots, \alpha_d$ form an integral basis for K. Take n = d and $L(\mathbf{X}) = \alpha_1 X_1 + \cdots + \alpha_d X_d$. Consider the norm form equation $\mathfrak{N}(L(\mathbf{x})) = 1$. Then we seek solutions to the equation $\mathfrak{N}(\varepsilon) = 1$ where $\varepsilon = \alpha_1 x_1 + \cdots + \alpha_d x_d$. Thus $\varepsilon$ is a unit. By Dirichlet's Theorem, we have infinitely many solutions unless K is $\mathbb{Q}$ or is imaginary quadratic.

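In general, suppose that there exists a subfield E with $\mathfrak{M}^E \ne \{0\}$. We claim that if $\lambda \in E$, then not only is $\lambda\mathfrak{M}^E \subseteq \mathbb{Q}\mathfrak{M}$ but $\lambda\mathfrak{M}^E \subseteq \mathbb{Q}\mathfrak{M}^E$. For if $\mu \in \mathfrak{M}^E$, then $\lambda\mu \in \mathbb{Q}\mathfrak{M}$, so that $m\lambda\mu \in \mathfrak{M}$ for some positive integer m. Given $\lambda' \in E$, we have $\lambda' m \lambda \mu \in \mathbb{Q}\mathfrak{M}$, since $\lambda' m \lambda \in E$. This shows that $m\lambda\mu \in \mathfrak{M}^E$, thus $\lambda\mu \in \mathbb{Q}\mathfrak{M}^E$.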

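Now let $\mathfrak{O}^E_{\mathfrak{M}}$ be the set of $\lambda \in E$ with $\lambda\mathfrak{M}^E \subseteq \mathfrak{M}^E$. The set $\mathfrak{O}^E_{\mathfrak{M}}$ has the following properties:

(i) It is a ring containing 1.

(ii) It contains a field basis of E over $\mathbb{Q}$. For if $\lambda_1, \ldots, \lambda_e$ is a field basis, then $\lambda_i \mathfrak{M}^E \subseteq \frac{1}{m}\mathfrak{M}^E$ for some positive $m \in \mathbb{Z}$. Then $m\lambda_1, \ldots, m\lambda_e \in \mathfrak{O}^E_{\mathfrak{M}}$ form a field basis of E over $\mathbb{Q}$.

(iii) There exists an $\ell > 0$, $\ell \in \mathbb{Z}$ such that $\ell\,\mathfrak{O}^E_{\mathfrak{M}}$ contains only algebraic integers. For if $\mu \ne 0$ is in $\mathfrak{M}^E$ then $\lambda\mu \in \mathfrak{M}^E$ for every $\lambda \in \mathfrak{O}^E_{\mathfrak{M}}$. Then $\lambda \in \frac{1}{\mu}\mathfrak{M}^E$. But $\frac{1}{\mu}\mathfrak{M}^E$ is a free module with only finitely many generators, so there exists an $\ell$ so that $\frac{\ell}{\mu}\mathfrak{M}^E$ contains only algebraic integers. Then $\ell\,\mathfrak{O}^E_{\mathfrak{M}} \subseteq \frac{\ell}{\mu}\mathfrak{M}^E$ contains only algebraic integers.


Any subring of E which satisfies these three conditions is called an order of E. The set $\mathfrak{O}^E$ of all the integers in E is an example of an order. It is a fact that any order $\mathfrak{O}$ in E is contained in the maximal order, $\mathfrak{O}^E$. See Borevich and Shafarevich (1966) for a more complete discussion.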

Example. Take $E = \mathbb{Q}(\sqrt{2})$. Then $\mathfrak{O}^E$ consists of $x + \sqrt{2}\,y$, with $x, y \in \mathbb{Z}$. Another example of an order consists of $x + 2\sqrt{2}\,y$ with $x, y \in \mathbb{Z}$.

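We call $\mathfrak{O}^E_{\mathfrak{M}}$ the ring of multipliers. Take $\mathfrak{E}^E_{\mathfrak{M}}$ to be the group of units $\varepsilon$ of $\mathfrak{O}^E_{\mathfrak{M}}$ of norm $\mathfrak{N}_{E/\mathbb{Q}}(\varepsilon) = 1$. By Dirichlet's Theorem on units, this group is infinite unless $E = \mathbb{Q}$ or E is imaginary quadratic. Now suppose that $\mu_0 \ne 0$ is in $\mathfrak{M}^E$. For $\varepsilon \in \mathfrak{E}^E_{\mathfrak{M}}$ we have $\mathfrak{N}(\varepsilon\mu_0) = \mathfrak{N}(\mu_0)$ and $\varepsilon\mu_0 \in \mathfrak{M}^E$. Then if $\mathfrak{N}(\mu_0) = m$, the norm-form equation


$\mathfrak{N}(\mu) = m, \quad \mu \in \mathfrak{M}^E$


has infinitely many solutions.

So the condition of non-degeneracy is necessary to ensure the finiteness of the number of solutions.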

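Example. Let $K = \mathbb{Q}(i, \sqrt{2})$ and $F(x, y, z) = \mathfrak{N}(x + iy + i\sqrt{2}z)$. Then $\mathfrak{M} : x + iy + i\sqrt{2}z$. Now let $E = \mathbb{Q}(\sqrt{2})$. We have $\mathfrak{M}^E : iy + i\sqrt{2}z = i(y + \sqrt{2}z)$ and $\mathfrak{O}^E_{\mathfrak{M}} = \mathfrak{O}^E$, the ring of integers in E. We know that $\mathfrak{E}^E_{\mathfrak{M}}$ is infinite and we have a unit $\varepsilon = \sqrt{2} - 1$.

Any solution of $\mathfrak{N}_E(y + \sqrt{2}z) = \pm 1$ gives a solution $iy + i\sqrt{2}z \in \mathfrak{M}$ of $\mathfrak{N}(iy + i\sqrt{2}z) = 1$. Let us start with a particular solution of $\mathfrak{N}_E(y + \sqrt{2}z) = \mathfrak{N}_E(\mu) = 1$, say with $\mu_0 = 1$ (so that $y_0 = 1$, $z_0 = 0$). By multiplication with powers of $\varepsilon$ we obtain further solutions. Setting $\mu_t = \mu_0\varepsilon^t = \varepsilon^t$, we have $\mu_1 = \varepsilon = \sqrt{2} - 1$ (so that $y_1 = -1$, $z_1 = 1$), $\mu_2 = (\sqrt{2} - 1)^2 = 3 - 2\sqrt{2}$ (so that $y_2 = 3$, $z_2 = -2$), etc.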
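
The orbit of solutions in this example is easy to generate. The sketch below is an illustration and not part of the notes. It assumes the expanded form $F(x,y,z) = (x^2 + y^2 + 2z^2)^2 - 8y^2z^2$, obtained here by multiplying the four conjugates of $x + iy + i\sqrt{2}z$ by hand, and it uses the recurrence $(y, z) \mapsto (2z - y,\ y - z)$ for multiplication of $y + \sqrt{2}z$ by $\sqrt{2} - 1$.

```python
def norm_form(x, y, z):
    """F(x, y, z) = N(x + i y + i sqrt2 z) for K = Q(i, sqrt2), expanded by hand (assumed formula)."""
    s = x*x + y*y + 2*z*z
    return s*s - 8*y*y*z*z


def unit_orbit(steps):
    """Coefficients (y_t, z_t) of (sqrt2 - 1)^t = y_t + sqrt2 * z_t for t = 0..steps."""
    y, z = 1, 0
    orbit = [(y, z)]
    for _ in range(steps):
        # (y + sqrt2 z)(sqrt2 - 1) = (2z - y) + sqrt2 (y - z)
        y, z = 2*z - y, y - z
        orbit.append((y, z))
    return orbit
```

Every pair $(y_t, z_t)$ produced this way should satisfy `norm_form(0, y_t, z_t) == 1`, giving as many distinct solutions of F = 1 as one cares to list.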

The reader may find a further discussion, especially about the degenerate case, in 
Schmidt (Lecture Notes, 1980). We return to the case where F is non-degenerate and 
consider bounds on the number of solutions. 


THEOREM 3A. (Schmidt (1986b)). If $F(\mathbf{X})$ is a non-degenerate norm form of degree d with coefficients in $\mathbb{Z}$, then the norm form equation


$F(\mathbf{x}) = m, \quad \mathbf{x} \in \mathbb{Z}^n$


has at most

$d^{2^{30n} d^2}\, c_1(n, d, m)$

solutions where $c_1(n, d, 1) = 1$ and


ex(n,d,m) = Gay dn—i(m*), 


where $\omega$ is the number of distinct prime factors of m and $d_{n-1}(\ell)$ is the number of ways of writing $\ell = \ell_1 \cdots \ell_{n-1}$ with $\ell_i > 0$ $(i = 1, \ldots, n-1)$.
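
The counting function $d_k(\ell)$ used here (with k = n - 1) is simple to compute. The following is a minimal sketch with a hypothetical function name; it counts ordered factorizations recursively by choosing the first factor among the divisors of $\ell$.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ordered_factorizations(l, k):
    """Number of ways to write l = l_1 * ... * l_k with positive integers l_i (order matters)."""
    if k == 1:
        return 1  # the single factor must be l itself
    return sum(ordered_factorizations(l // t, k - 1)
               for t in range(1, l + 1) if l % t == 0)
```

For k = 2 this is the ordinary divisor function, so `ordered_factorizations(12, 2)` counts the divisors of 12.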

As was seen in Section 6 of Chapter III, the general case follows from the case m = 1. Thus, one may concentrate on the norm-form equation


$F(\mathbf{x}) = 1$


and the bound


(3.1)  $d^{2^{30n} d^2}.$


In these Notes, we will not prove Theorem 3A and (3.1), but a variation. See Theorem 3B below.

Incidentally, Schmidt (1989b) has also proved another bound in place of (3.1), namely

$d^{(2n)^{n 2^{n+4}}}.$

This second bound can probably be improved, removing one of the exponents by using some combinatorial techniques. The second bound is nicer for fixed n, since it grows only like a polynomial in terms of d.

How can this be generalized to the degenerate cases? There we would have finitely 
many solutions up to multiplication by certain units. An explicit bound so far has not 
been derived. 

Also, what about asymptotic estimates for the number of solutions? The inequality $|F(\mathbf{x})| \le m$ defines some n-dimensional set of volume $c_F m^{n/d}$ where $c_F$ depends on F only. Mahler (1934) has shown in the case n = 2, i.e. for the Thue case, that the number of solutions of this inequality is $\sim c_F m^{n/d}$ as $m \to \infty$. Ramachandra (1969) proved this asymptotic formula for a class of norm form equations.
Now we will specialize the norm forms F somewhat. Let F be a norm form given by
$F(\mathbf{X}) = L^{(1)}(\mathbf{X}) \cdots L^{(d)}(\mathbf{X}),$


and let E be a normal field containing the coefficients of $L^{(1)}, \ldots, L^{(d)}$. Say that L has coefficients in a field K of degree d and $\mathbb{Q} \subseteq K \subseteq E$ where $E/\mathbb{Q}$ is normal. Let $G = \mathrm{Gal}(E/\mathbb{Q})$. Then G acts on the linear forms $\{L^{(1)}, \ldots, L^{(d)}\}$ by acting on their coefficients. If $1 \le t \le d$ we say that G acts t times transitively if $L^{(1)}, \ldots, L^{(d)}$ are distinct and if for any distinct integers $i_1, \ldots, i_t$, there is a $\sigma \in G$ with
$\sigma(L^{(j)}) = L^{(i_j)}$  $(1 \le j \le t).$


If, for instance, the Galois group of K is $S_d$, the symmetric group on d elements, then G is d times transitive.


THEOREM 3B. Let everything be as above. If G acts n - 1 times transitively and if any n among $L^{(1)}, \ldots, L^{(d)}$ are linearly independent, then the number of solutions of $F(\mathbf{x}) = 1$ is

J <a, 


It follows that forms as in Theorem 3B are non-degenerate. This could also be shown 
directly. 


Example. Let $K = \mathbb{Q}(\alpha)$, where $\alpha$ is an algebraic integer of degree d > 2, and suppose $G = S_d$. Take


$F(\mathbf{X}) = \mathfrak{N}(X_1 + \alpha X_2 + \cdots + \alpha^{d-2} X_{d-1}).$


Then the hypotheses are satisfied.

The class of equations treated in Theorem 3B includes Thue equations. 

The remainder of this chapter will be devoted to the proof of Theorem 3B. (There 
are some extra technical difficulties for Theorem 3A.) 


§4. A Reduction. 


Given two norm forms F, G we say $F \sim G$ if $F = G^T$ for some $T \in SL(n, \mathbb{Z})$. Recall that $G^T(\mathbf{X}) = G(T\mathbf{X})$. The number of solutions of $F(\mathbf{x}) = 1$ is invariant under this equivalence relation. In Chapter III, Section 2, for a prime p, we exhibited certain transformations $T_0, T_1, \ldots, T_p$ with $\det T_j = p$ such that


$\mathbb{Z}^n = \bigcup_{j=0}^{p} T_j \mathbb{Z}^n.$


So instead of studying $F(\mathbf{x}) = 1$, we could study the equations $F^{T_j}(\mathbf{x}) = 1$ $(j = 0, 1, \ldots, p)$. So what is the advantage of using the $F^{T_j}$'s? From Chapter III, we know that $D(F^T) = |\det T|^{|I|} D(F)$, and then $D(F^{T_j}) \ge p^{|I|}$. The advantage, then, is that we can consider forms whose semi-discriminant is large. The disadvantage is that we now have p + 1 norm form equations to consider.
Let N(n, d) be the maximum number of solutions to $F(\mathbf{x}) = 1$ for all norm forms F of degree d in n variables of the type described in Theorem 3B. Let N(n, d, p) be the corresponding bound if we restrict to forms F with semi-discriminant $D(F) \ge p^{|I|}$. Then


$N(n, d) \le (p + 1)\,N(n, d, p),$
which is Lemma 2C of Chapter III. By Lemma 2B of the same chapter, we have


$D(F) \le H^*(F)^{|I| n/d}.$


Recall that I is the set of all n-tuples $i_1, \ldots, i_n$ in $1 \le i_j \le d$ with $L^{(i_1)}, \ldots, L^{(i_n)}$ linearly independent. (Under the hypothesis of Theorem 3B, I consists of all such n-tuples of distinct numbers.) Combining this last inequality with $D(F) \ge p^{|I|}$, we see that we may restrict ourselves to forms with


$H^*(F) \ge p^{d/n}.$


PROPOSITION 4A. If $p > n^{10n^2}$, then
N(n,d,p) < d-in, 


We may deduce the main theorem (3B) from this proposition. Recall that we need to show that the number of solutions to $F(\mathbf{x}) = 1$ is


< a” . 


We pick a prime p with


$n^{10n^2} < p < 2n^{10n^2}.$
Then we have 

N(n,d) S (p+ 1)N(n,d, p) 

< wee 

< ae 


—10n? 5, 10n? 


since d >n. 
We may restrict still further. For a norm form $F = aL^{(1)} \cdots L^{(d)}$, we have height $H^*(F) = H(L)^d \operatorname{cont} F$. In general, this height is not invariant under $\sim$. We let


$\mathfrak{H}(F) = \min_{G \sim F} H^*(G).$

This minimum exists, since among all forms L with coefficients in a given number field, there are only finitely many forms with $H(L) < B$ for any bound B. Furthermore, this $\mathfrak{H}(F)$ is invariant under $\sim$.
So we restrict to norm forms F with $\mathfrak{H}(F) = H^*(F)$. We have seen that when $D(F) \ge p^{|I|}$, we get $\mathfrak{H}(F) \ge p^{d/n}$. In the proposition, $p > n^{10n^2}$, so the inequality becomes


(4.1)  $\mathfrak{H}(F) > n^{10nd}.$


How does this apply to the linear form L? In counting solutions of $F(\mathbf{x}) = 1$, we may suppose that cont F = 1. Otherwise, we would have no solutions. Then (4.1) implies
$H(L) > n^{10n}.$
In the sections which follow, we will distinguish large and small solutions. Small 
solutions will be those with 


|z| S$ (EEE = HPEY., 


The remaining solutions will be called large solutions. 


§5. An Application of the Geometry of Numbers. 


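Let $L = \alpha_1 X_1 + \cdots + \alpha_n X_n$ with $\alpha_i \in K$, and write $L^{(i)} = \alpha_1^{(i)} X_1 + \cdots + \alpha_n^{(i)} X_n$ $(1 \le i \le d)$. Let

$\mathbf{a}_j = (\alpha_j^{(1)}, \ldots, \alpha_j^{(d)})^{t}$  $(j = 1, \ldots, n)$

and $\mathbf{A} = \mathbf{a}_1 \wedge \cdots \wedge \mathbf{a}_n$. Then $\mathbf{A}$ lies in $\mathbb{C}^{\ell}$, where $\ell = \binom{d}{n}$. If $\mathbf{A} = (A_1, \ldots, A_{\ell})$, introduce


$\Delta(L) = |\mathbf{A}| = \sqrt{|A_1|^2 + \cdots + |A_{\ell}|^2}.$


Then it is easily seen that


$\Delta(L^T) = \Delta(L)$
for $T \in SL(n, \mathbb{Z})$.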


LEMMA 5A. Given the notation above,


$|a|^{n/d}\,\Delta(L) \ge \frac{V(n)}{2^n n^{3/2}}\; d^{n/2}\, (\operatorname{cont} F)^{(n-1)/d}\, \mathfrak{H}(F)^{1/d},$

where V(n) is the volume of the unit ball in $\mathbb{R}^n$.

Notice that both sides are invariant under $\sim$. The right-hand side depends only on F, but the left-hand side depends on how we write $F = aL^{(1)} \cdots L^{(d)}$. We could write instead $F = a'L'^{(1)} \cdots L'^{(d)}$, where $L' = \lambda L$. Then $a' = a/\mathfrak{N}(\lambda)$, but there is no simple way to express $\Delta(L')$ in terms of $\lambda$ and $\Delta(L)$.

The matrix

$(\alpha_j^{(i)})$  $(1 \le i \le d,\ 1 \le j \le n)$

has rank n, so $\mathbf{a}_1, \ldots, \mathbf{a}_n$ are linearly independent. Let $\Lambda$ be the set of all linear combinations


$x_1 \mathbf{a}_1 + \cdots + x_n \mathbf{a}_n$


where $x_i \in \mathbb{Z}$. We will see that $\Lambda$ may be interpreted as an n-dimensional lattice in some Euclidean space.

Suppose that the field K has $r_1$ real embeddings and $r_2$ pairs of complex conjugate embeddings. Then $r_1 + 2r_2 = d$. Suppose

$\alpha \mapsto \alpha^{(i)}$

is real for $1 \le i \le r_1$, and

$\alpha \mapsto \alpha^{(i)}, \quad \alpha \mapsto \alpha^{(i + r_2)}$

are complex conjugate embeddings for $r_1 + 1 \le i \le r_1 + r_2$. Let $E^d$ be the space of vectors

$\mathbf{z} = (z_1, \ldots, z_d)^{t}$

where $z_1, \ldots, z_{r_1}$ are real and $z_i, z_{i + r_2}$ are complex conjugates for $r_1 + 1 \le i \le r_1 + r_2$. Then $E^d$ is a vector space over $\mathbb{R}$ of dimension d. Introduce an inner product

$\mathbf{z} \cdot \mathbf{z}' = z_1 \overline{z_1'} + \cdots + z_d \overline{z_d'}$

and a norm

$|\mathbf{z}| = \sqrt{\mathbf{z} \cdot \mathbf{z}}$

on $E^d$.
Exercise 5a. Show that, with this inner product, $E^d$ is a Euclidean vector space.
By inspection, $\mathbf{a}_j \in E^d$. Then we may interpret $\Lambda$ as an n-dimensional lattice in $E^d$, as mentioned earlier. Now we may consider successive minima. In other words, introduce $\mu_1, \ldots, \mu_n$ where $\mu_j$ is the least positive real number such that there are j linearly independent points $\mathbf{g}$ of $\Lambda$ with $|\mathbf{g}| \le \mu_j$. Here $|\ |$ is the norm above.


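If $\mathbf{z} = u_1 \mathbf{a}_1 + \cdots + u_n \mathbf{a}_n \in \Lambda$, with components $z^{(i)}$, then $z^{(i)} = L^{(i)}(u_1, \ldots, u_n)$. We have

$|a|\,|z^{(1)} \cdots z^{(d)}| = |F(u_1, \ldots, u_n)| \ge \operatorname{cont} F,$

unless $u_1 = \cdots = u_n = 0$. So every non-zero lattice point $\mathbf{z}$ has

$|z^{(1)} \cdots z^{(d)}| \ge (\operatorname{cont} F)/|a|.$

The Arithmetic-Geometric Inequality gives

$|\mathbf{z}| = \sqrt{|z^{(1)}|^2 + \cdots + |z^{(d)}|^2} \ge \sqrt{d}\,\bigl(|z^{(1)}|\,|z^{(2)}| \cdots |z^{(d)}|\bigr)^{1/d} \ge \sqrt{d}\,\bigl((\operatorname{cont} F)/|a|\bigr)^{1/d}.$

We may conclude that


(5.1)  $\mu_1 \ge \sqrt{d}\,\bigl((\operatorname{cont} F)/|a|\bigr)^{1/d}.$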


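Now suppose that $\mathbf{b}_1, \ldots, \mathbf{b}_n$ is another basis of $\Lambda$. Say

$\mathbf{b}_j = (\beta_j^{(1)}, \ldots, \beta_j^{(d)})^{t} = m_{j1}\mathbf{a}_1 + \cdots + m_{jn}\mathbf{a}_n,$

where the matrix $M = (m_{jk})$ is in $SL(n, \mathbb{Z})$. Introduce the row vectors

$\boldsymbol{\alpha}^{(i)} = (\alpha_1^{(i)}, \ldots, \alpha_n^{(i)})$

and

$\boldsymbol{\beta}^{(i)} = (\beta_1^{(i)}, \ldots, \beta_n^{(i)}).$

Then we have

$\beta_j^{(i)} = m_{j1}\alpha_1^{(i)} + \cdots + m_{jn}\alpha_n^{(i)} = \alpha_1^{(i)} m_{j1} + \cdots + \alpha_n^{(i)} m_{jn},$

so $\boldsymbol{\beta}^{(i)} = \boldsymbol{\alpha}^{(i)} M^{t}$, where $M^{t}$ is the transpose of M. Let

$F^{M^t}(\mathbf{X}) = F(M^t \mathbf{X}).$

Since $F^{M^t} \sim F$, we have

$H^*(F^{M^t}) \ge \mathfrak{H}(F),$

so that

$|a| \prod_{i=1}^{d} |\boldsymbol{\beta}^{(i)}| \ge \mathfrak{H}(F).$

Using the Arithmetic-Geometric Inequality, we get

$\sum_{i=1}^{d} |\boldsymbol{\beta}^{(i)}|^2 \ge d\,\bigl(\mathfrak{H}(F)/|a|\bigr)^{2/d}.$

Recall that

$\boldsymbol{\beta}^{(i)} = (\beta_1^{(i)}, \ldots, \beta_n^{(i)})$ and $\mathbf{b}_j = (\beta_j^{(1)}, \ldots, \beta_j^{(d)})^{t}.$

Therefore

(5.2)  $\sum_{j=1}^{n} |\mathbf{b}_j|^2 \ge d\,\bigl(\mathfrak{H}(F)/|a|\bigr)^{2/d},$

which says that a basis $\mathbf{b}_1, \ldots, \mathbf{b}_n$ cannot be too small.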
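Given our lattice $\Lambda$, there are linearly independent lattice points $\mathbf{g}_1, \ldots, \mathbf{g}_n$ such that

$|\mathbf{g}_1| = \mu_1,\ |\mathbf{g}_2| = \mu_2,\ \ldots,\ |\mathbf{g}_n| = \mu_n,$

where $\mu_1, \ldots, \mu_n$ are the successive minima. For n = 2 these $\mathbf{g}_i$ necessarily form a basis, but for n > 2, they are not necessarily a basis. However, one can show that there is a basis $\mathbf{b}_1, \ldots, \mathbf{b}_n$ with the property that

$|\mathbf{b}_j| \le j\mu_j$  $(j = 1, \ldots, n).$

Exercise 5b. Verify this last statement. The reader may consult Cassels' text on the Geometry of Numbers (1959).
Given such a basis, we have

$\sum_{j=1}^{n} |\mathbf{b}_j|^2 \le \Bigl(\sum_{j=1}^{n} j^2\Bigr)\mu_n^2 \le n^3 \mu_n^2.$

If we combine this with (5.2) we obtain

$\mu_n \ge \frac{d^{1/2}}{n^{3/2}}\,\bigl(\mathfrak{H}(F)/|a|\bigr)^{1/d}.$

From (5.1), we also had

$\mu_j \ge d^{1/2}\,\bigl(\operatorname{cont} F/|a|\bigr)^{1/d}$  $(j = 1, \ldots, n-1).$

Taking the product of these inequalities we see that

$\mu_1 \mu_2 \cdots \mu_n \ge \frac{d^{n/2}}{n^{3/2}}\,|a|^{-n/d}\,(\operatorname{cont} F)^{(n-1)/d}\,\mathfrak{H}(F)^{1/d}.$

By Minkowski's Second Theorem (2E of Chapter I), we have

$\mu_1 \cdots \mu_n V(n) \le 2^n \det \Lambda,$

so that

$\det \Lambda \ge \frac{V(n)}{2^n n^{3/2} |a|^{n/d}}\; d^{n/2}\,(\operatorname{cont} F)^{(n-1)/d}\,\mathfrak{H}(F)^{1/d}.$

But

$\det \Lambda = |\det(\mathbf{a}_i \cdot \mathbf{a}_j)|^{1/2} = |\mathbf{a}_1 \wedge \cdots \wedge \mathbf{a}_n| = \Delta(L),$

and the proof is complete.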


§6. Products of Linear Forms. 


LEMMA 6A. Suppose $F(\mathbf{X}) = aL^{(1)}(\mathbf{X}) \cdots L^{(d)}(\mathbf{X})$ is a norm form with coefficients in $\mathbb{Z}$. Suppose $\mathbf{x} \in \mathbb{Z}^n$ is a solution of


$F(\mathbf{x}) = 1.$
Then there exist $i_1, \ldots, i_n$ with $1 \le i_1 < \cdots < i_n \le d$ such that

$|L^{(i_1)}(\mathbf{x}) \cdots L^{(i_n)}(\mathbf{x})| \le \frac{2^n n^{3/2}}{(n!)^{1/2} V(n)}\; |\det(L^{(i_1)}, \ldots, L^{(i_n)})|\; \mathfrak{H}(F)^{-1/d}.$


The significant term on the right hand side is $\mathfrak{H}(F)^{-1/d}$. Why would such a lemma be useful? Recall, in the case of the Thue equation, if $|\alpha_1 y - x| \cdots |\alpha_d y - x| = 1$, then $|\alpha_i y - x|$ was small for some i. Then $|\alpha_i y - x|\,|\alpha_j y - x|$ was small for any j. We now have an analogue. Here we have n variables, and we can find some n of the linear forms such that their product is small at $\mathbf{x}$.


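Proof. We have

\[ aL^{(1)}(\mathbf{x})\cdots L^{(d)}(\mathbf{x}) = 1. \]

Introduce

\[ V^{(i)}(\mathbf{X}) = L^{(i)}(\mathbf{X})/L^{(i)}(\mathbf{x}) \qquad (i = 1,\dots,d), \]

so that

\[ F(\mathbf{X}) = V^{(1)}(\mathbf{X})\cdots V^{(d)}(\mathbf{X}). \]

Now apply the lemma of Section 5 with $a = 1$, $\operatorname{cont} F = 1$, and $V$ in place of $L$. We have $\operatorname{cont} F = 1$ because we have an integer solution to $F(\mathbf{x}) = 1$. The lemma gives

\[ \Delta(V) \ge \frac{V(n)}{2^n n^{3/2}}\,d^{n/2}\,\mathfrak{H}(F)^{1/d}. \]

Let

\[ \mathbf{a}_i = \bigl(a_i^{(1)}/L^{(1)}(\mathbf{x}),\ \dots,\ a_i^{(d)}/L^{(d)}(\mathbf{x})\bigr)^{T} \qquad (i = 1,\dots,n). \]

Then

\[ \Delta(V)^2 = |\mathbf{a}_1\wedge\dots\wedge\mathbf{a}_n|^2 \ge \frac{V(n)^2}{2^{2n} n^{3}}\,d^{n}\,\mathfrak{H}(F)^{2/d}. \]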
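Now let $D(i_1,\dots,i_n)$ be the determinant of the submatrix of $\bigl(a_i^{(j)}/L^{(j)}(\mathbf{x})\bigr)$ where $j$ is among $i_1,\dots,i_n$. Then

\[ \Delta(V)^2 = \sum_{1 \le i_1 < \dots < i_n \le d} |D(i_1,\dots,i_n)|^2, \]

and the number of $n$-tuples in the sum is $\binom{d}{n} \le d^n/n!$. Then some $D(i_1,\dots,i_n)$ satisfies

\[ |D(i_1,\dots,i_n)|^2 \ge \frac{V(n)^2\,n!}{2^{2n} n^{3}}\,\mathfrak{H}(F)^{2/d}, \]

and

\[ |D(i_1,\dots,i_n)| \ge \frac{V(n)\,(n!)^{1/2}}{2^n n^{3/2}}\,\mathfrak{H}(F)^{1/d}. \]

But

\[ |D(i_1,\dots,i_n)| = \frac{|\det(L^{(i_1)},\dots,L^{(i_n)})|}{|L^{(i_1)}(\mathbf{x})\cdots L^{(i_n)}(\mathbf{x})|}, \]

so we can get an upper bound for $|L^{(i_1)}(\mathbf{x})\cdots L^{(i_n)}(\mathbf{x})|$ in terms of $|\det(L^{(i_1)},\dots,L^{(i_n)})|$. We have

\[ |L^{(i_1)}(\mathbf{x})\cdots L^{(i_n)}(\mathbf{x})| \le \frac{|\det(L^{(i_1)},\dots,L^{(i_n)})|}{P}, \]

where

\[ P = \mathfrak{H}(F)^{1/d}\,\frac{(n!)^{1/2}\,V(n)}{2^n n^{3/2}}. \]

In the case of the Thue equation, we used a Gap Principle to get a bound on the number of solutions. We will do a similar thing in the next section.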
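
The identity expressing $\Delta^2$ as a sum of squared $n \times n$ minors in the proof of Lemma 6A is an instance of the Cauchy-Binet formula. The Python sketch below checks it on a small integer matrix and also checks the pigeonhole step (some minor has square at least the average). The matrix entries and the helper names `det`, `gram_det` and `sum_squared_minors` are illustrative assumptions, and the example is real-valued, whereas the lemma works with complex conjugate forms.

```python
from itertools import combinations
from math import comb


def det(m):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def gram_det(rows):
    """Gram determinant of the n columns of a d x n matrix (this is Delta^2)."""
    n = len(rows[0])
    cols = [[r[k] for r in rows] for k in range(n)]
    gram = [[sum(x * y for x, y in zip(ci, cj)) for cj in cols] for ci in cols]
    return det(gram)


def squared_minors(rows):
    """Squares of all n x n minors obtained by choosing n of the d rows."""
    n = len(rows[0])
    return [det([rows[i] for i in idx]) ** 2
            for idx in combinations(range(len(rows)), n)]


def sum_squared_minors(rows):
    return sum(squared_minors(rows))


if __name__ == "__main__":
    rows = [[1, 2], [3, 4], [0, 1], [2, -1]]   # d = 4, n = 2
    print(gram_det(rows), sum_squared_minors(rows))
    # Pigeonhole: the largest squared minor is at least Delta^2 / binom(d, n).
    print(max(squared_minors(rows)) * comb(4, 2) >= gram_det(rows))
```

For the sample matrix both quantities equal 164, and the largest squared minor is 121.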


§7. A Generalized Gap Principle. 


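We first present a simple argument in diophantine approximations. Consider rational approximations $x/y$ to a real number $\alpha$ with

\[ \left|\alpha - \frac{x}{y}\right| < \frac{1}{Py^2}, \]

and say $P \ge 4$. Given such reduced and distinct $x_1/y_1,\dots,x_\nu/y_\nu$ with $y_1 \le \dots \le y_\nu$, then

\[ \frac{1}{y_{j-1}y_j} \le \left|\frac{x_{j-1}}{y_{j-1}} - \frac{x_j}{y_j}\right| \le \left|\alpha - \frac{x_{j-1}}{y_{j-1}}\right| + \left|\alpha - \frac{x_j}{y_j}\right| < \frac{2}{Py_{j-1}^2}, \]

therefore

\[ y_j > \frac{P}{2}\,y_{j-1}, \]

and so

\[ y_\nu \ge (P/2)^{\nu-1}. \]

The number of such approximations in reduced form with $1 \le y \le B$ is

\[ \le 1 + \frac{\log B}{\log(P/2)} \le 1 + \frac{2\log B}{\log P}. \]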
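
The counting argument above can be tried out numerically. The following Python sketch lists all reduced fractions $x/y$ with $1 \le y \le B$ and $|\alpha - x/y| < 1/(Py^2)$ by brute force, using exact rational arithmetic, and compares the count with $1 + \log B/\log(P/2)$. The choice of $\alpha$ (a rational number with continued fraction $[0; 10, 10, 10, 10, 10, 10]$), the parameters $P = 8$, $B = 20000$, and the function names are assumptions made for illustration only.

```python
from fractions import Fraction
from math import gcd, log


def continued_fraction_value(quotients):
    """Exact value of the finite continued fraction [q0; q1, ..., qk]."""
    value = Fraction(quotients[-1])
    for q in reversed(quotients[:-1]):
        value = q + 1 / value
    return value


def good_approximations(alpha, P, B):
    """Reduced x/y with 1 <= y <= B and |alpha - x/y| < 1/(P*y^2), P >= 4."""
    found = []
    for y in range(1, B + 1):
        # Since 1/(P*y) <= 1/4, only the nearest integer to alpha*y can work.
        x = round(alpha * y)
        if gcd(x, y) == 1 and abs(alpha - Fraction(x, y)) < Fraction(1, P * y * y):
            found.append((x, y))
    return found


if __name__ == "__main__":
    alpha = continued_fraction_value([0, 10, 10, 10, 10, 10, 10])
    P, B = 8, 20000
    found = good_approximations(alpha, P, B)
    denominators = [y for _, y in found]
    print(denominators)
    # Gap principle: consecutive denominators grow by a factor > P/2.
    print(all(b > (P / 2) * a for a, b in zip(denominators, denominators[1:])))
    print(len(found), "<=", 1 + log(B) / log(P / 2))
```

In this example the approximations found are exactly the convergents of $\alpha$ with denominator at most $B$, and their denominators grow by a factor of about 10 at each step, well above the guaranteed factor $P/2 = 4$.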
We now want to generalize this. 


LEMMA 7A. Suppose $L_1,\dots,L_n$ are $n > 1$ linearly independent linear forms in $n$ variables with complex coefficients. Suppose

\[ (n!)^4 < P \le B \]

and put

\[ Q = (\log B)/\log P. \]

Then the integer points $\mathbf{x}$ in the ball $|\mathbf{x}| \le B$ with

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| \le |\det(L_1,\dots,L_n)|\,P^{-1} \]

lie in the union of at most

\[ n^{3n} Q^{n-1} \]

proper subspaces of $\mathbb{Q}^n$.


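Proof. The proof falls into two parts, the first being a reduction of the problem.

(A) If $L(\mathbf{X}) = \alpha_1X_1 + \dots + \alpha_nX_n = \boldsymbol{\alpha}\mathbf{X}$ and $M = \boldsymbol{\beta}\mathbf{X}$, then let $(L, M) = \boldsymbol{\alpha}\bar{\boldsymbol{\beta}}$ and $|L| = \sqrt{(L,L)}$. We will reduce to the case where $L_1,\dots,L_n$ are pairwise orthogonal, i.e. $(L_i, L_j) = 0$ for $i \ne j$.

If $L_i(\mathbf{X}) = \boldsymbol{\alpha}_i\mathbf{X}$, then let

\[ \hat{\boldsymbol{\alpha}}_i = (-1)^{i-1}\,\boldsymbol{\alpha}_1\wedge\dots\wedge\boldsymbol{\alpha}_{i-1}\wedge\boldsymbol{\alpha}_{i+1}\wedge\dots\wedge\boldsymbol{\alpha}_n, \]

which has $n$ components since $\binom{n}{n-1} = n$. We have

(7.1) \[ \boldsymbol{\alpha}_i\hat{\boldsymbol{\alpha}}_j = \delta_{ij}\det(\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_n), \]

where $\delta_{ij}$ is the Kronecker symbol. This last statement is true because the coordinates of the $\hat{\boldsymbol{\alpha}}_i$ are the minors of the matrix with rows $\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_n$, so we get the determinant expanded about the $i$th row.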
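
Relation (7.1) is the familiar cofactor expansion, and it can be checked on a concrete matrix. In the Python sketch below the vector playing the role of $\hat{\boldsymbol{\alpha}}_i$ is computed as the vector of signed minors (cofactors) of row $i$; the sign convention is chosen so that (7.1) holds, and the sample matrix and the names `det` and `cofactor_vector` are hypothetical.

```python
def det(m):
    """Determinant of a small square integer matrix by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def cofactor_vector(rows, i):
    """Signed (n-1) x (n-1) minors of the matrix with row i deleted (n >= 2)."""
    n = len(rows)
    others = rows[:i] + rows[i + 1:]
    return [(-1) ** (i + k) * det([r[:k] + r[k + 1:] for r in others])
            for k in range(n)]


def pairing_table(rows):
    """Matrix of products alpha_i . alpha_hat_j, expected to be det * identity."""
    n = len(rows)
    hats = [cofactor_vector(rows, j) for j in range(n)]
    return [[sum(a * h for a, h in zip(rows[i], hats[j])) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    rows = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
    print(det(rows))
    print(pairing_table(rows))
```

For the sample matrix the determinant is 18, so the table of products is 18 times the identity matrix.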

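The assertion of the lemma is invariant under replacing $L_i$ by the multiple $\lambda_iL_i$. We choose

\[ \lambda_i = |\hat{\boldsymbol{\alpha}}_i| \big/ \bigl(|\hat{\boldsymbol{\alpha}}_1|\cdots|\hat{\boldsymbol{\alpha}}_n|\bigr)^{1/(n-1)} \]

and replace

\[ \boldsymbol{\alpha}_i \mapsto \lambda_i\boldsymbol{\alpha}_i, \]

so that

\[ \hat{\boldsymbol{\alpha}}_i \mapsto \hat{\boldsymbol{\alpha}}_i/|\hat{\boldsymbol{\alpha}}_i|. \]

After such a transformation, we may suppose that $|\hat{\boldsymbol{\alpha}}_1| = \dots = |\hat{\boldsymbol{\alpha}}_n| = 1$.

Suppose that we restrict our attention to solutions where

\[ |L_n(\mathbf{x})| = \max_{1 \le i \le n}|L_i(\mathbf{x})|. \]

It suffices to show that the number of subspaces required under this restriction is

\[ \le \frac{1}{n}\,n^{3n}Q^{n-1}. \]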
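Write

\[ \bar{\hat{\boldsymbol{\alpha}}}_n = c_1\boldsymbol{\alpha}_1 + \dots + c_n\boldsymbol{\alpha}_n. \]

We have

\[ \bar{\hat{\boldsymbol{\alpha}}}_n\hat{\boldsymbol{\alpha}}_j = \sum_{i=1}^{n} c_i\,\boldsymbol{\alpha}_i\hat{\boldsymbol{\alpha}}_j = c_j\det(\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_n) \]

by (7.1). Looking at the left-hand side, we know that

\[ |\bar{\hat{\boldsymbol{\alpha}}}_n\hat{\boldsymbol{\alpha}}_j| \le 1 \]

with equality when $j = n$. Thus

\[ |c_j| \le |c_n| \qquad (j = 1,\dots,n). \]

Put

\[ \boldsymbol{\alpha}'_n = \bar{\hat{\boldsymbol{\alpha}}}_n/c_n = c'_1\boldsymbol{\alpha}_1 + \dots + c'_{n-1}\boldsymbol{\alpha}_{n-1} + \boldsymbol{\alpha}_n, \]

with $|c'_i| \le 1$, and put

\[ L'_n(\mathbf{X}) = \boldsymbol{\alpha}'_n\mathbf{X}. \]

We have

\[ \det(L_1,\dots,L_{n-1},L_n) = \det(L_1,\dots,L_{n-1},L'_n) \]

and $L'_n$ is orthogonal to $L_1,\dots,L_{n-1}$. Also

\[ L'_n(\mathbf{x}) = c'_1L_1(\mathbf{x}) + \dots + c'_{n-1}L_{n-1}(\mathbf{x}) + L_n(\mathbf{x}), \]

so that

\[ |L'_n(\mathbf{x})| \le n\,|L_n(\mathbf{x})|, \]

since $|c'_i| \le 1$ and $|L_n(\mathbf{x})| = \max_{1 \le i \le n}|L_i(\mathbf{x})|$. Then

\[ |L_1(\mathbf{x})\cdots L_{n-1}(\mathbf{x})L'_n(\mathbf{x})| \le n\,|L_1(\mathbf{x})\cdots L_n(\mathbf{x})| \le n\,|\det(L_1,\dots,L_n)|\,P^{-1} = n\,|\det(L_1,\dots,L_{n-1},L'_n)|\,P^{-1}. \]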
Thus it suffices to show that when $L_n$ is orthogonal to $L_1,\dots,L_{n-1}$, then the solutions of

\[ |\mathbf{x}| \le B \]

and

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| \le n\,|\det(L_1,\dots,L_n)|\,P^{-1} \]

lie in the union of not more than

\[ \frac{1}{n}\,n^{3n}Q^{n-1} \]

proper subspaces. This is the key reduction.
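We repeat this argument. As before, we may suppose that $|\hat{\boldsymbol{\alpha}}_1| = \dots = |\hat{\boldsymbol{\alpha}}_n| = 1$ and we restrict to solutions with

\[ |L_{n-1}(\mathbf{x})| = \max_{1 \le i \le n-1}|L_i(\mathbf{x})|. \]

Again, write

\[ \bar{\hat{\boldsymbol{\alpha}}}_{n-1} = c_1\boldsymbol{\alpha}_1 + \dots + c_{n-1}\boldsymbol{\alpha}_{n-1}, \]

where $\boldsymbol{\alpha}_n$ does not appear since $\bar{\hat{\boldsymbol{\alpha}}}_{n-1}$ is orthogonal to $\boldsymbol{\alpha}_n$ and the orthogonal complement of $\boldsymbol{\alpha}_n$ is spanned by $\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_{n-1}$. We have $|c_j| \le |c_{n-1}|$ $(1 \le j \le n-1)$, and we take

\[ \boldsymbol{\alpha}'_{n-1} = \bar{\hat{\boldsymbol{\alpha}}}_{n-1}/c_{n-1} \]

and

\[ L'_{n-1}(\mathbf{X}) = \boldsymbol{\alpha}'_{n-1}\mathbf{X}. \]

Then

\[ |L'_{n-1}(\mathbf{x})| \le (n-1)\,|L_{n-1}(\mathbf{x})|. \]

Thus it suffices to show that when $L_n$ is orthogonal to $L_i$ $(i \ne n)$ and $L_{n-1}$ is orthogonal to $L_i$ $(i \ne n-1)$, then the solutions of

\[ |\mathbf{x}| \le B \]

and

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| < n(n-1)\,|\det(L_1,\dots,L_n)|\,P^{-1} \]

lie in no more than

\[ \frac{1}{n(n-1)}\,n^{3n}Q^{n-1} \]

proper subspaces.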
Applying this argument repeatedly, we get the following reduction. If $L_1,\dots,L_n$ are pairwise orthogonal, then the solutions of

\[ |\mathbf{x}| \le B \]

and

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| < n!\,|\det(L_1,\dots,L_n)|\,P^{-1} \]

lie in at most

\[ (2n^2)^{n-1}Q^{n-1} < \frac{1}{n!}\,n^{3n}Q^{n-1} \]

proper subspaces.

We may reduce even further by supposing that $(L_i, L_j) = \delta_{ij}$. To do so, just multiply by suitable factors to make $|L_i| = 1$ for $1 \le i \le n$. Then $|\det(L_1,\dots,L_n)| = 1$.
(B) Having completed the reduction process, we study solutions $\mathbf{x}$ to

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| < n!\,P^{-1}, \]

\[ |\mathbf{x}| \le B \]

and therefore

\[ |L_i(\mathbf{x})| \le B \qquad (i = 1,\dots,n). \]

Put

\[ C = \bigl(P/(n!)^2\bigr)^{1/(n-1)}, \]

so that

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| < 1/(n!\,C^{n-1}), \]

and put

\[ R = \log(n!\,B^n)/\log C. \]

We subdivide the solutions into several classes. Either for some $i$ in $1 \le i \le n-1$ we have

\[ |L_i(\mathbf{x})| < BC^{-R} = B^{1-n}/n!, \qquad (E_i) \]

or we have

\[ BC^{-p_i-1} \le |L_i(\mathbf{x})| < BC^{-p_i} \quad (i = 1,\dots,n-1) \qquad (E_{p_1,\dots,p_{n-1}}) \]

for integers $p_1,\dots,p_{n-1}$ with $0 \le p_i < R$. If $\mathbf{x}$ is a solution in one of the first classes, say $E_i$, then one of the first $n-1$ forms, namely $L_i(\mathbf{x})$, is really small at $\mathbf{x}$. The second set of classes covers the cases where this is not true.

Let $\mathbf{x}_1,\dots,\mathbf{x}_n$ be solutions in some class $E_i$. Then

\[ |\det(\mathbf{x}_1,\dots,\mathbf{x}_n)| = |\det(L_1,\dots,L_n)|\,|\det(\mathbf{x}_1,\dots,\mathbf{x}_n)| = |\det(L_i(\mathbf{x}_j))| < n!\,B^{n-1}\cdot\frac{B^{1-n}}{n!} = 1. \]

Then $\det(\mathbf{x}_1,\dots,\mathbf{x}_n) = 0$ and any $n$ solutions in this class are linearly dependent. In other words, all of the solutions in a given $E_i$ must lie in a single subspace. So counting solutions in $E_1,\dots,E_{n-1}$, we have at most $n-1$ subspaces.
Now consider solutions in one of the second classes. We have

\[ |L_1(\mathbf{x})\cdots L_n(\mathbf{x})| < 1/(n!\,C^{n-1}), \]

so

\[ |L_n(\mathbf{x})| < \frac{C^{p_1+\dots+p_{n-1}}}{n!\,B^{n-1}}. \]

Let $\mathbf{x}_1,\dots,\mathbf{x}_n$ be solutions in this class, called $E_{p_1,\dots,p_{n-1}}$. Then

\[ |\det(\mathbf{x}_1,\dots,\mathbf{x}_n)| = |\det(L_i(\mathbf{x}_j))| < n!\,B^{n-1}C^{-p_1-\dots-p_{n-1}}\cdot\frac{C^{p_1+\dots+p_{n-1}}}{n!\,B^{n-1}} = 1, \]

and $\mathbf{x}_1,\dots,\mathbf{x}_n$ are linearly dependent, as before. In this case, how many subspaces may we have? In other words, how many classes can occur? We have $p_1,\dots,p_{n-1}$ with $0 \le p_i < R$, which gives no more than

\[ (R+1)^{n-1} \]

possibilities. Recall that

\[ R = \frac{\log(n!\,B^n)}{\log C}, \]

and by hypothesis

\[ n!\,B^n < B^{n+1}. \]

We also have

\[ C = \bigl(P/(n!)^2\bigr)^{1/(n-1)} \ge P^{1/(2n-2)} \]

by hypothesis. Combining these gives

\[ R < \frac{\log B^{n+1}}{\log P^{1/(2n-2)}} = (2n^2-2)Q, \]

and we have

\[ (R+1)^{n-1} \le \bigl((2n^2-2)Q+1\bigr)^{n-1} \]

subspaces.

All together, the total number of subspaces is

\[ \le n-1+\bigl((2n^2-2)Q+1\bigr)^{n-1} \le (2n^2Q)^{n-1}, \]

which is what we asserted.
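
The two elementary inequalities that close the proof, namely $n-1+((2n^2-2)Q+1)^{n-1} \le (2n^2Q)^{n-1}$ and $(2n^2)^{n-1}Q^{n-1} \le n^{3n}Q^{n-1}/n!$, can be spot-checked with exact arithmetic. The Python sketch below does this for a few values of $n$ and $Q \ge 1$; the function name `final_count_ok` and the sampled values are illustrative assumptions, and a finite check is of course not a proof.

```python
from fractions import Fraction
from math import factorial


def final_count_ok(n, Q):
    """Check n-1+((2n^2-2)Q+1)^(n-1) <= (2n^2 Q)^(n-1) <= n^(3n) Q^(n-1) / n!."""
    Q = Fraction(Q)
    lhs = (n - 1) + ((2 * n * n - 2) * Q + 1) ** (n - 1)
    mid = (2 * n * n * Q) ** (n - 1)
    rhs = Fraction(n ** (3 * n), factorial(n)) * Q ** (n - 1)
    return lhs <= mid <= rhs


if __name__ == "__main__":
    samples = [Fraction(1), Fraction(3, 2), Fraction(7), Fraction(100)]
    print(all(final_count_ok(n, Q) for n in range(2, 13) for Q in samples))
```

The case $n = 2$, $Q = 1$ gives equality in the first inequality ($8 \le 8$), so the condition $Q \ge 1$, which follows from $P \le B$, cannot be dropped.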
§8. Small Solutions.


In Section 6, we saw that $|F(\mathbf{x})| = 1$ implies that for some $i_1 < \dots < i_n$, we have

(8.1) \[ |L^{(i_1)}(\mathbf{x})\cdots L^{(i_n)}(\mathbf{x})| \le |\det(L^{(i_1)},\dots,L^{(i_n)})|\,P^{-1}, \]

where

\[ P = \frac{V(n)\,(n!)^{1/2}}{2^n n^{3/2}}\,\mathfrak{H}(F)^{1/d}. \]

Exercise 8a. Show that

\[ (n!)^{1/2}\,V(n) > 1/(2n)^{n/2}. \]


Using this exercise, we get

\[ P > \frac{1}{(2\sqrt{2n})^{n}\,n^{3/2}}\,\mathfrak{H}(F)^{1/d} > \mathfrak{H}(F)^{1/(2d)} > (n!)^4, \]

where the last two inequalities follow since $\mathfrak{H}(F) > n^{10nd}$ by (4.1).
As we mentioned previously, small solutions will be those with 


Izl S 9 (F). 


Let this bound be $B$. By Lemma 7A, the small solutions satisfying (8.1) lie in no more than

\[ n^{3n}\,(\log B/\log P)^{n-1} \]

subspaces, and we have


n n-1 
n?” (log B/ log Py < no” (= ren) 


(log H(F))/2d 

< 12” nin qv g 
Counting the number of possibilities for the $n$-tuple $1 \le i_1 < \dots < i_n \le d$, which is $\binom{d}{n} < d^n$, we get the following result.


PROPOSITION 8A. Under our hypotheses, the small solutions of 
|F(z)| = 1 


lie in the union of at most 
12” nin qr tn 
proper subspaces. 


Note that for small solutions we did not need non-degeneracy or the hypothesis of Theorem 3B, but only the fact that the matrix of $L^{(1)},\dots,L^{(d)}$ is of rank $n$.

§9. Large Solutions.

We will count the large solutions only in the special case considered in Theorem 3B.


LEMMA 9A. Let $L_1,\dots,L_n$ be $n$ linearly independent linear forms in $n$ variables with coefficients in a number field $E$. Using linear independence of $L_1,\dots,L_n$, write each variable as

\[ X_i = \gamma_{i1}L_1 + \dots + \gamma_{in}L_n \qquad (i = 1,\dots,n) \]

with $\gamma_{ij} \in E$. Then

\[ |\gamma_{ij}|\,|L_j| \le H_E(L_1)\cdots H_E(L_n) \qquad (i,j = 1,\dots,n). \]

Proof. In fact, for any absolute value $v^* \in M(E)$, we claim that

\[ |\gamma_{ij}|_{v^*}\,|L_j|_{v^*} \le H_E(L_1)\cdots H_E(L_n). \]
Write 


and put 


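For $a \in E^*$, we have the product formula, namely

\[ |a|_{v^*}\prod_{\substack{v \in M(E)\\ v \ne v^*}} |a|_v = 1. \]

Writing $a = \det(\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_n)$, we get

(9.1) \[ \frac{1}{|\det(\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_n)|_{v^*}} = \prod_{\substack{v \in M(E)\\ v \ne v^*}} |\det(\boldsymbol{\alpha}_1,\dots,\boldsymbol{\alpha}_n)|_v \le \prod_{\substack{v \in M(E)\\ v \ne v^*}} |\boldsymbol{\alpha}_1|_v\cdots|\boldsymbol{\alpha}_n|_v = \frac{H_E(\boldsymbol{\alpha}_1)\cdots H_E(\boldsymbol{\alpha}_n)}{|\boldsymbol{\alpha}_1|_{v^*}\cdots|\boldsymbol{\alpha}_n|_{v^*}}, \]

where we have used Hadamard's Inequality.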


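Now let $\boldsymbol{\alpha}_i = (\alpha_{i1},\dots,\alpha_{in})$, $A = (\alpha_{ij})$, $C = (\gamma_{ij})$. Then $AC = I$. Notice that the $\boldsymbol{\alpha}_i$ are simply the rows of $A$, while the $\boldsymbol{\gamma}_j = (\gamma_{1j},\dots,\gamma_{nj})^{T}$ are the columns of $C$. Thus, up to sign,

\[ \boldsymbol{\gamma}_j = \frac{1}{\det A}\,\boldsymbol{\alpha}_1\wedge\dots\wedge\boldsymbol{\alpha}_{j-1}\wedge\boldsymbol{\alpha}_{j+1}\wedge\dots\wedge\boldsymbol{\alpha}_n. \]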
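Now we have

\[ |\boldsymbol{\alpha}_j|_{v^*}\,|\boldsymbol{\gamma}_j|_{v^*} \le \frac{|\boldsymbol{\alpha}_1|_{v^*}\cdots|\boldsymbol{\alpha}_n|_{v^*}}{|\det A|_{v^*}} \le H_E(\boldsymbol{\alpha}_1)\cdots H_E(\boldsymbol{\alpha}_n) \]

by (9.1). Looking at the $i$th components of the vectors $\boldsymbol{\gamma}_j$ on the left, we have

\[ |L_j|_{v^*}\,|\gamma_{ij}|_{v^*} \le H_E(L_1)\cdots H_E(L_n) \]

as claimed.

Now suppose that $\mathbf{x}$ is a solution of $F(\mathbf{x}) = aL^{(1)}(\mathbf{x})\cdots L^{(d)}(\mathbf{x}) = 1$.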


LEMMA 9B. If $\mathbf{x}$ is a large solution in Theorem 3B, then there are indices $1 \le i_1 < \dots < i_n \le d$ such that

(9.2) \[ |L^{(i_1)}(\mathbf{x})\cdots L^{(i_n)}(\mathbf{x})| < |\det(L^{(i_1)},\dots,L^{(i_n)})|\,|\mathbf{x}|^{-1/2}. \]


Proof. It is convenient to normalize the forms $L^{(i)}$. That is, let

\[ M^{(i)}(\mathbf{X}) = L^{(i)}(\mathbf{X})/|L^{(i)}|, \]

so that $|M^{(i)}| = 1$ $(i = 1,\dots,d)$. (Notice that the $M^{(i)}$ are no longer necessarily conjugate forms.) We have from $F(\mathbf{x}) = 1$ that

\[ |M^{(1)}(\mathbf{x})\cdots M^{(d)}(\mathbf{x})| = \frac{1}{|a|\,|L^{(1)}|\cdots|L^{(d)}|} = \frac{1}{\mathfrak{H}(F)} \le 1. \]

Without loss of generality,

\[ |M^{(1)}(\mathbf{x})| \le \dots \le |M^{(d)}(\mathbf{x})|. \]

Since any $n$ linear forms in Theorem 3B are linearly independent, we may express

\[ X_i = \gamma_{i1}L^{(1)} + \dots + \gamma_{in}L^{(n)}. \]

Then by the preceding lemma,

\[ |\gamma_{ij}|\,|L^{(j)}| \le H_E(L^{(1)})\cdots H_E(L^{(n)}), \]

where $E$ is a field containing all of the coefficients of $L^{(1)},\dots,L^{(n)}$. Then $\deg E \le d^n$, and

\[ |\gamma_{ij}|\,|L^{(j)}| \le H(L)^{nd^n}. \]

Now express the variables $X_i$ as linear combinations of $M^{(1)},\dots,M^{(n)}$, say

\[ X_i = c_{i1}M^{(1)} + \dots + c_{in}M^{(n)}, \]

where $c_{ij} = \gamma_{ij}|L^{(j)}|$. Then

\[ |c_{ij}| \le H(L)^{nd^n}. \]


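We can now obtain an estimate for $|\mathbf{x}|$, which we will use to complete the proof. We have

\[ |\mathbf{x}| \le n^2\,H(L)^{nd^n}\,|M^{(n)}(\mathbf{x})|. \]

Since

\[ |M^{(n)}(\mathbf{x})| \le \dots \le |M^{(d)}(\mathbf{x})|, \]

the upper bound on $|\mathbf{x}|$ gives

\[ |M^{(n+1)}(\mathbf{x})\cdots M^{(d)}(\mathbf{x})| \ge \left(\frac{|\mathbf{x}|}{n^2 H(L)^{nd^n}}\right)^{d-n}. \]

Then

\[ |M^{(1)}(\mathbf{x})\cdots M^{(d)}(\mathbf{x})| \le 1 \]

leads to

\[ |M^{(1)}(\mathbf{x})\cdots M^{(n)}(\mathbf{x})| \le \left(\frac{n^2 H(L)^{nd^n}}{|\mathbf{x}|}\right)^{d-n}. \]

Combining this last inequality with

\[ |M^{(i)}(\mathbf{x})| = |L^{(i)}(\mathbf{x})|/|L^{(i)}|, \]

we have

\[ \frac{|L^{(1)}(\mathbf{x})\cdots L^{(n)}(\mathbf{x})|}{|\det(L^{(1)},\dots,L^{(n)})|} \le \frac{|L^{(1)}|\cdots|L^{(n)}|}{|\det(L^{(1)},\dots,L^{(n)})|}\left(\frac{n^2 H(L)^{nd^n}}{|\mathbf{x}|}\right)^{d-n} \le H(L)^{nd^n}\left(\frac{n^2 H(L)^{nd^n}}{|\mathbf{x}|}\right)^{d-n}, \]

since

\[ \frac{|L^{(1)}|\cdots|L^{(n)}|}{|\det(L^{(1)},\dots,L^{(n)})|} \le H(L)^{nd^n} \]

from the proof of the last lemma ((9.1) and the fact that $\deg E \le d^n$). Now, large solutions satisfy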


|x| > H(L)e""™* > ntan, 
(see the reduction of section 4), so 


ILM (2)... L™(2)| "T ei 
[det(L®,..., L)| 
Remark. We have not made full use of the lower bound on $|\mathbf{x}|$ that defines large solutions. Fuller use is made in the proof of Theorem 3A, which however is not presented here.

Now we apply the Subspace Theorem (Theorem 1D) with $\delta = 1/2$. The conclusion is that there exist subspaces $S_1,\dots,S_t$ where


t= [eae] Saye, 


such that solutions to (9.2) lie in the union of $S_1,\dots,S_t$ and the ball

\[ |\mathbf{x}| \le \max\bigl((n!)^{8/\delta},\ H(L)\bigr). \]

Since our large solutions cannot lie in this ball, we have the following result.


PROPOSITION 9C. Under the hypothesis of Theorem 3B, the large solutions 
to F(z) = 1 lie in the union of at most t proper subspaces, where 


t =(2d")*?""”, 


§10. Proof of Theorem 3B. 


Combining the results of Sections 8 and 9 on small and large solutions, we see that 
the solutions to the norm form equation F(z) = 1 lie in the union of not more than 


aay? 


subspaces. 
We want to count the number of solutions. We suppose that $S$ is one of the subspaces and that $S$ is given by a parameter representation

\[ \mathbf{x} = T\mathbf{y}, \]

where $\mathbf{y} \in \mathbb{Q}^{n-1}$ and $T : \mathbb{Q}^{n-1} \to \mathbb{Q}^{n}$. If $T$ is properly chosen, as $\mathbf{y}$ runs through $\mathbb{Z}^{n-1}$ then $\mathbf{x}$ will run through the integer points of $S$.

To study solutions $\mathbf{x} \in S$, we need to study $F(T\mathbf{y}) = 1$. Letting $L^*(\mathbf{y}) = L(T(\mathbf{y}))$, we have

(10.1) \[ aL^{*(1)}(\mathbf{y})\cdots L^{*(d)}(\mathbf{y}) = 1, \]

a norm form equation in $n-1$ variables. This allows us to do a proof by induction.

We would like to apply Theorem 3B to the new linear form $L^*$. By hypothesis, $G$ was $n-1$ times transitive on $L^{(1)},\dots,L^{(d)}$, so then $G$ is $n-1$ times transitive, hence $n-2$ times transitive on $L^{*(1)},\dots,L^{*(d)}$. Also, the rank of the forms $L^{*(1)},\dots,L^{*(d)}$ is $n-1$. So then there exist $n-1$ among them which are linearly independent. By $(n-1)$-transitivity, any $n-1$ among them are linearly independent. So both hypotheses are satisfied, therefore (10.1) has

< gee» 


solutions in S. Multiplying by the number of subspaces we get 


19260 80(n—1) 30n _ 2 
< qion2 5 a < a 10n 


solutions (with plenty to spare). This proves Proposition 4A, and therefore Theorem 
3B. 


Epilogue. The abc-conjecture. 


Let $a, b, c$ be non-zero integers with

\[ a + b + c = 0 \quad\text{and}\quad \gcd(a,b,c) = 1. \]

Put

\[ P = \prod_{p \mid abc} p. \]

J. Oesterlé posed the following question. Is there an absolute constant $c_1$ such that

\[ \max(|a|,|b|,|c|) \le P^{c_1}\,? \]


Masser (1985) refined this question. He conjectured that for any $\varepsilon > 0$, there exists a constant $c_2(\varepsilon)$ such that

\[ \max(|a|,|b|,|c|) < c_2(\varepsilon)\,P^{1+\varepsilon}. \]

This is known as the abc-conjecture. We will discuss consequences of the abc-conjecture. Our discussion will follow, to a large extent, a paper due to Stewart and Tijdeman (1986).
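
To get a feeling for the quantity $P$, the following Python sketch computes it for a triple $a + b = c$ of coprime positive integers (that is, the triple $(a, b, -c)$ in the notation above) together with the ratio $\log c/\log P$, which the conjecture predicts exceeds $1 + \varepsilon$ only finitely often. The function names `radical` and `abc_quality` and the sample triples are illustrative choices, not taken from the text.

```python
from math import gcd, log


def radical(n):
    """Product of the distinct primes dividing n (trial division, small n only)."""
    n = abs(n)
    rad, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            rad *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        rad *= n
    return rad


def abc_quality(a, b):
    """For coprime positive a, b return (P, log c / log P), c = a + b, P = rad(abc)."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    c = a + b
    P = radical(a * b * c)
    return P, log(c) / log(P)


if __name__ == "__main__":
    for a, b in [(1, 8), (3, 125), (1, 2), (5, 7)]:
        P, q = abc_quality(a, b)
        print(a, b, a + b, P, round(q, 4))
```

For $1 + 8 = 9$ one gets $P = 6$ and for $3 + 125 = 128$ one gets $P = 30$; in both cases $c > P$, which is the less common situation.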
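M. Hall Jr. (1971) conjectured that

\[ |x^2 - y^3| > c_3\,y^{1/2} \]

for positive integers $x, y$ with $x^2 \ne y^3$. A weaker version of Hall's conjecture follows from the abc-conjecture. To see this, let $d = \gcd(x^2, y^3)$, and then set $a = x^2/d$, $b = -y^3/d$, $c = (y^3 - x^2)/d$. Then

\[ P = \prod_{p \mid abc} p \le xy\,|y^3 - x^2|/d. \]

The abc-conjecture gives for any $\varepsilon > 0$ that

\[ |b| = y^3/d < c_2(\varepsilon)\,P^{1+\varepsilon} \]

and

\[ a = x^2/d < c_2(\varepsilon)\,P^{1+\varepsilon}. \]

Multiplying these inequalities, we get

\[ x^2y^3/d^2 < c_2(\varepsilon)^2\,P^{2+2\varepsilon} \le c_2(\varepsilon)^2\,\frac{x^{2+2\varepsilon}\,y^{2+2\varepsilon}\,|y^3 - x^2|^{2+2\varepsilon}}{d^{2+2\varepsilon}}, \]

and then

\[ x^2y^3 < c_2(\varepsilon)^2\,x^{2+2\varepsilon}\,y^{2+2\varepsilon}\,|y^3 - x^2|^{2+2\varepsilon}. \]

Now we have

\[ |x^2 - y^3|^{2+2\varepsilon} > c_2(\varepsilon)^{-2}\,x^{-2\varepsilon}\,y^{1-2\varepsilon}. \]

If $x < 2y^{3/2}$, then

\[ |x^2 - y^3|^{2+2\varepsilon} > c_4(\varepsilon)\,y^{1-5\varepsilon}, \]

and therefore

\[ |x^2 - y^3| > c_5(\varepsilon)\,y^{(1-5\varepsilon)/(2+2\varepsilon)}. \]

So, again for every $\varepsilon > 0$,

\[ |x^2 - y^3| > c_6(\varepsilon)\,y^{(1/2)-\varepsilon}. \]

On the other hand, if $x \ge 2y^{3/2}$, we get

\[ |x^2 - y^3| \ge y^3. \]

So a weak form of Hall's conjecture follows, namely

\[ |x^2 - y^3| > c_6(\varepsilon)\,y^{(1/2)-\varepsilon}. \]

This has the following consequence concerning a particular elliptic equation

\[ y^2 = x^3 + k, \qquad k \ne 0, \]

called the Mordell equation (here the roles of $x$ and $y$ are interchanged). Hall's conjecture gives

\[ |k| = |y^2 - x^3| > c_6(\varepsilon)\,|x|^{(1/2)-\varepsilon}. \]

Thus every solution to a given Mordell equation has

\[ |x| < c_7(\varepsilon)\,|k|^{2+\varepsilon}. \]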
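
Hall's conjecture, in the form $|x^2 - y^3| > c_3 y^{1/2}$ used above, is easy to probe numerically. The Python sketch below computes, for a given $y$, the smallest nonzero value of $|x^2 - y^3|$ and its ratio to $y^{1/2}$. The function name `hall_ratio` and the choice of sample values are assumptions for illustration; the value $y = 5234$ is a classical example where the ratio is unusually small.

```python
from math import isqrt


def hall_ratio(y):
    """Smallest nonzero |x^2 - y^3| over integers x >= 0, and its ratio to y^(1/2)."""
    cube = y ** 3
    r = isqrt(cube)
    # The nearest squares to y^3 are r^2 and (r+1)^2; skip an exact hit.
    gaps = [abs(x * x - cube) for x in (r, r + 1) if x * x != cube]
    gap = min(gaps)
    return gap, gap / y ** 0.5


if __name__ == "__main__":
    for y in (2, 32, 5234):
        gap, ratio = hall_ratio(y)
        print(y, gap, round(ratio, 4))
```

For $y = 5234$ the nearest square is $378661^2$, and $378661^2 - 5234^3 = 17$, which is small compared with $5234^{1/2} \approx 72.3$.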


Next we consider the Fermat conjecture. That is, we look at the equation

\[ x^n + y^n = z^n \]

where

\[ n \ge 3, \quad \gcd(x,y,z) = 1, \quad\text{and}\quad x, y, z > 0. \]

To apply the abc-conjecture, let $a = x^n$, $b = y^n$, $c = -z^n$. Then

\[ P = \prod_{p \mid abc} p \le xyz < z^3, \]

since $z$ is the largest of the three integers. By the abc-conjecture, we have

\[ z^n = |c| \le c_2(\varepsilon)\,P^{1+\varepsilon} \le c_2(\varepsilon)\,z^{3+3\varepsilon}, \]

so

\[ z^{n-3-3\varepsilon} \le c_2(\varepsilon). \]

Now suppose that $n \ge 4$ and $\varepsilon < 1/3$. Since $n - 3 - 3\varepsilon > 0$ in this case, we get a bound on $z$ as well as a bound on $n$. Thus, for sufficiently large $n$, Fermat's conjecture is correct. In other words, Fermat is correct with at most finitely many exceptional $n$. And for any given $n$, there are still only finitely many solutions.

The weaker Oesterlé conjecture also has this consequence about the Fermat Conjecture. However, in that case, we obtain a larger lower bound on $n$, so we get more possible exceptions. We then also need Faltings' Theorem (1983) to deal with small $n$.

The following conjecture was made by Pillai (1945). Given integers $A > 0$, $B > 0$, $C > 0$, the equation

\[ Ax^n - By^m = C \]

in integers $x > 1$, $y > 1$, $n > 1$, $m > 1$ and with $(m,n) \ne (2,2)$ has only a finite number of solutions. If $m, n$ were fixed, this would be a special case of an algebraic diophantine equation, the superelliptic equation. The conjecture has been proved only for $A = B = C = 1$, by Tijdeman (1976). This special case

\[ x^n - y^m = 1 \]

ties in with Catalan's conjecture, namely that this equation has no solutions except $3^2 - 2^3 = 1$. Tijdeman showed that all solutions have $x, y, m, n < B$ for some explicit $B$.
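
Catalan's equation $x^n - y^m = 1$ can be explored by a direct search for consecutive perfect powers. The Python sketch below collects all perfect powers up to a limit and reports those whose successor is also a perfect power; the search limit $10^6$ and the function names are illustrative assumptions, and such a search of course says nothing about larger solutions.

```python
def perfect_powers(limit):
    """All integers b**e <= limit with base b >= 2 and exponent e >= 2."""
    powers = set()
    b = 2
    while b * b <= limit:
        value = b * b
        while value <= limit:
            powers.add(value)
            value *= b
        b += 1
    return powers


def consecutive_perfect_powers(limit):
    """Pairs (q, q + 1) of perfect powers with q + 1 <= limit."""
    powers = perfect_powers(limit)
    return sorted((q, q + 1) for q in powers if q + 1 in powers)


if __name__ == "__main__":
    print(consecutive_perfect_powers(10 ** 6))
```

Up to $10^6$ the only pair found is $(8, 9)$, corresponding to $3^2 - 2^3 = 1$.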

To apply the abc-conjecture to Pillai's conjecture, let $d = \gcd(Ax^n, By^m, C)$. Set $a = Ax^n/d$, $b = -By^m/d$, and $c = -C/d$. Then

\[ P \le ABCxy, \]

and the abc-conjecture gives

\[ Ax^n/d,\ By^m/d \le c_2(\varepsilon)\,P^{1+\varepsilon}. \]

So

\[ \max(x^n, y^m) \le c_8(A,B,C,\varepsilon)\,(xy)^{1+\varepsilon}. \]

Without loss of generality, $x^n \le y^m$, so

\[ y^m \le c_8(A,B,C,\varepsilon)\,y^{(1+(m/n))(1+\varepsilon)}. \]

If $m \ge 3$, then $1 + (m/n) \le 1 + (m/2) \le \tfrac{5}{6}m$, so $(1 + (m/n))(1 + \varepsilon) < \tfrac{11}{12}m$ for $\varepsilon > 0$ sufficiently small. Then

\[ y^{m/12} < y^{m - (1+(m/n))(1+\varepsilon)} \le c_8(A,B,C,\varepsilon) \]

gives an upper bound for $y^m$, hence for $y$ and for $m$. On the other hand, if $m = 2$, then $n \ge 3$ and $1 + (m/n) \le 1 + (m/3) \le \tfrac{5}{6}m$, and the rest follows as before.

A more general application is as follows. Tijdeman (1989) proved that for given non-zero integers $A, B, C$ the diophantine equation

\[ Ax^n + By^m = Cz^{\ell} \]

has only finitely many solutions in positive integers $x > 1$, $y > 1$, $z > 1$, $n$, $m$, $\ell$ subject to $\gcd(Ax, By, Cz) = 1$ and $n^{-1} + m^{-1} + \ell^{-1} < 1$. On the other hand, Hindry has shown that for each triple $n, m, \ell$ with $n^{-1} + m^{-1} + \ell^{-1} \ge 1$, there exist $A, B, C$ such that the above equation has infinitely many solutions $x, y, z$ with $\gcd(Ax, By, Cz) = 1$.

Last, we consider $S$-unit equations over $\mathbb{Q}$. Let $K = \mathbb{Q}$ and $S = \{\infty, p_1,\dots,p_\nu\}$. Consider the equation

\[ x + y + z = 0, \]

where

\[ x, y, z \in U_S \cap \mathbb{Z} \quad\text{and}\quad \gcd(x,y,z) = 1. \]

To apply the abc-conjecture, let $a = x$, $b = y$, $c = z$. Then

\[ P \le p_1\cdots p_\nu \]

and

\[ |x|, |y|, |z| \le c_2(\varepsilon)\,P^{1+\varepsilon}. \]
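
For a concrete set $S$ the solutions of the $S$-unit equation can be listed by brute force. The Python sketch below takes the primes 2 and 3 and searches for coprime positive integers $a \le b$, composed only of these primes, whose sum $a + b$ is again composed only of these primes; each hit corresponds to a solution $(x, y, z) = (a, b, -(a+b))$ of $x + y + z = 0$. The choice of primes, the search bound $10^6$ and the function names are assumptions made for illustration.

```python
from math import gcd


def smooth_numbers(primes, limit):
    """All positive integers <= limit whose prime factors lie in `primes`."""
    nums = [1]
    for p in primes:
        extended = []
        for m in nums:
            while m <= limit:
                extended.append(m)
                m *= p
        nums = extended
    return set(nums)


def s_unit_sums(primes, limit):
    """Triples (a, b, a + b) of S-units with a <= b, gcd(a, b) = 1, a + b <= limit."""
    units = smooth_numbers(primes, limit)
    return sorted((a, b, a + b) for a in units for b in units
                  if a <= b and gcd(a, b) == 1 and a + b in units)


if __name__ == "__main__":
    print(s_unit_sums((2, 3), 10 ** 6))
```

Within this range the search returns only $1+1=2$, $1+2=3$, $1+3=4$ and $1+8=9$, in line with the bound $|x|, |y|, |z| \le c_2(\varepsilon)P^{1+\varepsilon}$ with $P \le 6$.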


The abc-conjecture may be an elusive goal. What do we know in this direction?

THEOREM 1. (Stewart and Yu, to appear). Under the hypotheses of the abc-conjecture,

\[ \max(|a|,|b|,|c|) < \exp\bigl(P^{2/3 + c_9/\log\log P}\bigr). \]

This improves upon an earlier bound due to Stewart and Tijdeman (1986).

THEOREM 2. (Stewart and Tijdeman, 1986). In this setting the conjecture would be false without the $\varepsilon$. In other words, it is not true that

\[ |a|, |b|, |c| < c_{10}\,P. \]

More precisely, given $\delta > 0$, there are infinitely many positive integers $a, b, c$ with $\gcd(a,b,c) = 1$ and $a = b + c$ such that

\[ a > P\exp\left((4-\delta)\,\frac{(\log P)^{1/2}}{\log\log P}\right), \]

where $P = \prod_{p \mid abc} p$.
We have the following generalization to $n$ variables. Suppose

\[ a_1 + a_2 + \dots + a_n = 0, \]

where the $a_i$ are non-zero integers, $\gcd(a_i, a_j) = 1$ for $i \ne j$, and no sub-sum vanishes. Let

\[ P = \prod_{p \mid a_1\cdots a_n} p. \]

The conjecture is that


max(|ai|,... , lan|) < cxa(n, Ej ime 